How to reload page in Puppeteer?

Posted by 送分小仙女 on 2019-12-10 17:54:49

Question


I would like to reload the page whenever the page doesn't load properly or encounters a problem. I tried page.reload() but it doesn't work.

for (const sect of sections) {
    // Now collect all the URLs
    const appUrls = await page.$$eval('div.main > ul.app-list > li > div.app-info a.app-info-icon', links => links.map(link => link.href));

    // Visit each URL one by one and collect the data
    for (let appUrl of appUrls) {
        var count = i++;
        try {
            await page.goto(appUrl);
            const appName = await page.$eval('div.det-name-int', div => div.innerText.trim());
            console.log('\n' + count);
            console.log(appName);
        } catch (e) {
            console.log('\n' + count);
            console.log('ERROR', e);
            await page.reload();
        }
    }
}

It gives me this error:

    ERROR Error: Error: failed to find element matching selector "div.det-name-int"
    at ElementHandle.$eval (C:\Users\Administrator\node_modules\puppeteer\lib\JSHandle.js:418:13)
    at process._tickCallback (internal/process/next_tick.js:68:7)
  -- ASYNC --
    at ElementHandle.<anonymous> (C:\Users\Administrator\node_modules\puppeteer\lib\helper.js:108:27)
    at DOMWorld.$eval (C:\Users\Administrator\node_modules\puppeteer\lib\DOMWorld.js:149:21)
    at process._tickCallback (internal/process/next_tick.js:68:7)
  -- ASYNC --
    at Frame.<anonymous> (C:\Users\Administrator\node_modules\puppeteer\lib\helper.js:108:27)
    at Page.$eval (C:\Users\Administrator\node_modules\puppeteer\lib\Page.js:329:29)
    at Page.<anonymous> (C:\Users\Administrator\node_modules\puppeteer\lib\helper.js:109:23)
    at main (C:\Users\Administrator\Desktop\webscrape\text.js:35:43)
    at process._tickCallback (internal/process/next_tick.js:68:7)

Some links are unable to load successfully. When I refresh those pages manually, it works. So I hope there is a function or a method that can help me reload my page automatically when there is an error.


Answer 1:


This works for me:

await page.reload({ waitUntil: ["networkidle0", "domcontentloaded"] });

See Puppeteer docs for details: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagereloadoptions
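
As a rough sketch of how that call could be wrapped, the helper below reloads with the same waitUntil options and reports whether the reload succeeded. The name reloadAndCheck and the timeoutMs parameter are hypothetical, not part of Puppeteer; the helper only assumes that page.reload() resolves with the main response (or null) and rejects on a navigation timeout.

```javascript
// Hypothetical helper (not a Puppeteer API): reload the page and report
// whether the main document came back with a successful status.
async function reloadAndCheck(page, timeoutMs = 30000) {
  try {
    const response = await page.reload({
      waitUntil: ['networkidle0', 'domcontentloaded'],
      timeout: timeoutMs,
    });
    // The response may be null, so guard before calling ok()
    return Boolean(response && response.ok());
  } catch (err) {
    // For example, a timeout while waiting for the network to go idle
    return false;
  }
}
```

Inside the catch block from the question, this could be used as `const reloaded = await reloadAndCheck(page);` to decide whether it is worth reading the element again.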




Answer 2:


You can always reload the page via the DOM, like this:

await page.evaluate(() => {
   location.reload(true)
})

There are also many other ways to reload a page with browser JavaScript via the DOM.

You can also navigate back and forward in Puppeteer, like this:

await page.goBack();
await page.goForward();



Answer 3:


I managed to solve it using a while loop.

for (let appUrl of appUrls) {
    var count = i++;

    // Keep retrying the same URL until the element is found
    while (true) {
        try {
            await page.goto(appUrl);

            const appName = await page.$eval('div.det-name-int', div => div.innerText.trim());

            console.log('\n' + count);
            console.log('Name: ', appName);

            break;
        } catch (e) {
            console.log('\n' + count);
            console.log('ERROR');
            // page.reload() takes an options object, not a URL
            await page.reload();

            continue;
        }
    }
}
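
One caveat with while(true) is that a URL that never loads correctly will loop forever. A possible variation, sketched here with a hypothetical helper name (scrapeWithRetry) and a hypothetical maxAttempts parameter, caps the number of attempts and rethrows the last error once they are used up. It relies only on page.goto() and page.$eval(), which already appear in the code above.

```javascript
// Hypothetical helper: visit a URL and read an element's text,
// retrying the whole navigation a limited number of times.
async function scrapeWithRetry(page, url, selector, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await page.goto(url);
      // $eval throws if no element matches the selector
      return await page.$eval(selector, el => el.innerText.trim());
    } catch (err) {
      lastError = err;
      console.log(`Attempt ${attempt} of ${maxAttempts} failed for ${url}`);
    }
  }
  // Every attempt failed: surface the last error to the caller
  throw lastError;
}
```

Inside the loop over appUrls, the call might look like `const appName = await scrapeWithRetry(page, appUrl, 'div.det-name-int');`, with a try/catch around it to skip URLs that keep failing.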



Answer 4:


So, following the comments, this is the error being raised:

ERROR Error: Error: failed to find element matching selector "div.det-name-int"

It happens because page.$eval runs your callback in the browser on the element that matches the selector. If no element matches, it throws an error instead of calling the callback.

Also, the page is reloaded, but you're not doing anything after that. If you want to fetch the element's text after the reload, use

await page.$eval('div.det-name-int', div => div.innerText.trim());

after the reload. Alternatively, use a while loop that checks whether the element exists and, if it doesn't, refreshes the page and checks again. This keeps retrying until the content is present.

But if your content is generated dynamically and is not yet part of the DOM when you read the page, this approach won't help. You may need to wait (with a timeout) and then search the DOM for the element.
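
A minimal sketch of that idea, assuming the element is added to the DOM some time after the initial load: wait for the selector with page.waitForSelector() and a timeout, then read it, and reload only if the wait fails. The helper name readTextWhenReady and the options timeoutMs and maxReloads are hypothetical.

```javascript
// Hypothetical helper: wait for an element to appear, read its text,
// and reload the page a limited number of times if it never shows up.
async function readTextWhenReady(page, selector, { timeoutMs = 10000, maxReloads = 2 } = {}) {
  for (let reloads = 0; ; reloads++) {
    try {
      // Rejects with a timeout error if the selector does not appear in time
      await page.waitForSelector(selector, { timeout: timeoutMs });
      return await page.$eval(selector, el => el.innerText.trim());
    } catch (err) {
      // Give up once the reload budget is spent
      if (reloads >= maxReloads) throw err;
      await page.reload({ waitUntil: 'domcontentloaded' });
    }
  }
}
```

After `await page.goto(appUrl)`, this could be called as `const appName = await readTextWhenReady(page, 'div.det-name-int');`. The timeout and reload count are placeholders to tune for the site being scraped.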



Source: https://stackoverflow.com/questions/55236975/how-to-reload-page-in-puppeteer
