Managing puppeteer for memory and performance

天命终不由人 2020-12-08 22:35

I'm using puppeteer for scraping some pages, but I'm curious about how to manage this in production for a node app. I'll be scraping up to 500,000 pages in a day, but the

3 answers
  •  刺人心 (original poster)
    2020-12-08 23:14

    If you are scraping 500,000 pages per day (approximately one page every 0.1728 seconds), then I would recommend opening a new page in an existing browser session rather than opening a new browser session for each page.
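
    The arithmetic behind that figure can be checked with a few lines of JavaScript. This is only a rough sketch: the function names are made up for illustration, and the 3-second average page time in the last line is an assumed value that you would replace with your own measurement.

```javascript
// Back-of-the-envelope throughput estimate for the volume quoted above.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

// Average time budget per page at a given daily volume.
function secondsPerPage(pagesPerDay) {
  return SECONDS_PER_DAY / pagesPerDay;
}

// Number of pages that must be in flight at once if each page takes
// `avgSecondsPerPage` to load and scrape (an assumed, measured value).
function requiredConcurrency(pagesPerDay, avgSecondsPerPage) {
  return Math.ceil((pagesPerDay * avgSecondsPerPage) / SECONDS_PER_DAY);
}

console.log(secondsPerPage(500000));         // 0.1728
console.log(requiredConcurrency(500000, 3)); // 18
```

    Since a single page load usually takes far longer than 0.17 seconds, several pages have to be open at the same time, which is another reason to share one browser between them.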

    You can open and close a Page using the following method:

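    // Assumes `browser` was created earlier with puppeteer.launch().
    const page = await browser.newPage();
    // ... navigate and scrape here ...
    await page.close();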
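
    One possible way to make sure pages never leak is to wrap the open/close pair in a small helper. This is a sketch, not something from the Puppeteer API: `withPage` and `task` are hypothetical names, and only `browser.newPage()` and `page.close()` are real Puppeteer calls.

```javascript
// Sketch: run one scraping task in a fresh page of an existing browser and
// always close the page afterwards, even if the task throws.
// `withPage` and `task` are hypothetical names used for illustration.
async function withPage(browser, task) {
  const page = await browser.newPage();
  try {
    // The task receives the page and returns whatever it scraped.
    return await task(page);
  } finally {
    // Runs on success and on error, so the page is always released.
    await page.close();
  }
}
```

    It could be used as `await withPage(browser, async (page) => { await page.goto(url); return page.title(); })`, where `url` is whatever address you are scraping.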
    

    If you decide to use one Browser for your project, I would make sure to implement error handling procedures to ensure that if the program crashes, you have minimal downtime while you create a new Page, Browser, or BrowserContext.
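
    As a rough illustration of that kind of recovery, the sketch below keeps one shared browser and launches a replacement the next time one is requested after a crash. `createBrowserManager`, `getBrowser`, and `launch` are hypothetical names; the only Puppeteer behaviour it relies on is the `'disconnected'` event that a Browser emits when it closes or crashes. The launcher is passed in, so in a real app you would supply something like `() => puppeteer.launch()`.

```javascript
// Sketch: share one browser and relaunch it lazily after a crash.
// `createBrowserManager` and `launch` are hypothetical names; `launch` is
// assumed to be a function returning a promise for a Puppeteer Browser.
function createBrowserManager(launch) {
  let browserPromise = null;

  async function getBrowser() {
    if (!browserPromise) {
      const current = launch().then((browser) => {
        // Puppeteer emits 'disconnected' when the browser closes or crashes.
        // Forget this browser so the next caller launches a new one.
        browser.on('disconnected', () => {
          if (browserPromise === current) browserPromise = null;
        });
        return browser;
      });
      // A failed launch should not stay cached either.
      current.catch(() => {
        if (browserPromise === current) browserPromise = null;
      });
      browserPromise = current;
    }
    return browserPromise;
  }

  return { getBrowser };
}
```

    Every task would then call `getBrowser()` before opening its page instead of holding on to a browser reference, so a crash costs only the pages that were in flight plus one relaunch.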
