Scraping 500k pages: works locally, blocked on EC2. How do you scale?
📰 Reddit r/learnprogramming
Hey folks, I’m working on a project where I need to collect reviews for around 500k hotels. APIs (Google, Tripadvisor, etc.) are turning out to be quite expensive at this scale, so I’m exploring scraping as an alternative. Here’s my situation: I don’t need real-time data; even updating once every 1–2 months is fine. I know that when I run the scraper locally, things work reasonably well. But when I move the same setup
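For a rough sense of scale, here is a back-of-the-envelope sketch of the average request rate that a 1–2 month refresh cycle implies. It assumes one request per hotel, which the post does not state (reviews may span several pages per hotel), and `required_rate` is just an illustrative helper name.

```python
def required_rate(pages: int, window_days: float, requests_per_page: int = 1) -> float:
    """Average requests per second needed to cover `pages` within `window_days`."""
    seconds = window_days * 24 * 60 * 60
    return pages * requests_per_page / seconds


# 500k hotels refreshed every 1 or 2 months (taken here as 30 and 60 days).
# requests_per_page=1 is an assumption, not a figure from the post.
for days in (30, 60):
    rate = required_rate(500_000, days)
    print(f"{days} days: {rate:.3f} req/s, one request every {1 / rate:.1f} s")
```

Under that assumption the crawl needs well under one request per second on average, so the refresh window leaves plenty of room to pace requests slowly.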