I'm using Puppeteer for scraping some pages, but I'm curious about how to manage this in production for a Node app. I'll be scraping up to 500,000 pages in a day, but the
If you are scraping 500,000 pages per day (approximately one page every 0.1728 seconds), then I would recommend opening a new page in an existing browser session rather than opening a new browser session for each page.
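To make the arithmetic concrete, here is a rough sketch of how that per-page budget translates into the number of pages you would need open at once. The helper names and the 3-second average scrape time are assumptions for illustration only, so measure your own pages before relying on the result.

```javascript
// Back-of-the-envelope throughput math for a daily scraping target.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

// Time budget per page if pages were scraped strictly one after another.
function secondsPerPage(pagesPerDay) {
  return SECONDS_PER_DAY / pagesPerDay;
}

// Approximate number of pages that must be in flight at once to keep up,
// given an average scrape time per page (hypothetical, measure your own).
function pagesInFlight(pagesPerDay, avgSecondsPerPage) {
  return Math.ceil(avgSecondsPerPage / secondsPerPage(pagesPerDay));
}

console.log(secondsPerPage(500000));   // time budget per page, in seconds
console.log(pagesInFlight(500000, 3)); // assumes ~3 s per page on average
```

If a single page takes longer than the per-page budget, which is likely for most real sites, you will need several pages open at the same time, and that is another reason to share one browser rather than launch one per page.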
You can open and close a Page using the following method:
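// `browser` is an existing Browser instance, e.g. from puppeteer.launch()
const page = await browser.newPage(); // open a new tab in the running browser
// ... navigate and scrape here ...
await page.close(); // close the tab when you are done with it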
If you decide to use one Browser for your project, I would make sure to implement error handling so that if a Page, a BrowserContext, or the Browser itself crashes, you can create a replacement with minimal downtime.
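As a minimal sketch of that kind of error handling, the wrapper below keeps one browser alive, opens a fresh page per URL, and relaunches the browser when something fails. `createScraper`, `launchBrowser`, and `maxAttempts` are hypothetical names, not part of Puppeteer. The only Puppeteer calls assumed are `browser.newPage()`, `browser.close()`, `page.goto()`, and `page.close()`. It also assumes pages are scraped one at a time, since relaunching the browser would kill any other open pages.

```javascript
// Sketch: reuse one Browser, open a new Page per URL, relaunch on failure.
// `launchBrowser` is any async function that returns a Puppeteer Browser,
// for example: () => puppeteer.launch()
function createScraper(launchBrowser, maxAttempts = 2) {
  let browser = null;

  // Lazily launch the shared browser the first time it is needed.
  async function getBrowser() {
    if (!browser) {
      browser = await launchBrowser();
    }
    return browser;
  }

  // Drop the current browser so the next attempt starts a fresh one.
  async function resetBrowser() {
    const old = browser;
    browser = null;
    if (old) {
      try {
        await old.close();
      } catch (err) {
        // The browser may already be dead; nothing more to do.
      }
    }
  }

  // Open a page, navigate, run `handler(page)`, and always close the page.
  async function scrape(url, handler) {
    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      let page = null;
      try {
        const current = await getBrowser();
        page = await current.newPage();
        await page.goto(url);
        return await handler(page);
      } catch (err) {
        lastError = err;
        await resetBrowser(); // heavy-handed: any error triggers a relaunch
      } finally {
        if (page) {
          try {
            await page.close();
          } catch (err) {
            // The page went away with the browser; ignore.
          }
        }
      }
    }
    throw lastError;
  }

  return { scrape, close: resetBrowser };
}
```

You would call it with something like `createScraper(() => puppeteer.launch())` and then `await scraper.scrape(url, (page) => page.title())`. Relaunching on every error is deliberately simple. In a real job you would probably want to distinguish a navigation timeout from a crashed browser and only relaunch for the latter.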