Is there a way to get puppeteer's waitUntil “networkidle” to only consider XHR (ajax) requests?

Posted by 人走茶凉 on 2019-12-07 12:29:34

Question


I am using puppeteer to evaluate the javascript-based HTML of web pages in my test app.

This is the code I am using to make sure all the data is loaded:

await page.setRequestInterception(true);
page.on("request", (request) => {
  if (request.resourceType() === "image" || request.resourceType() === "font" || request.resourceType() === "media") {
    console.log("Request intercepted! ", request.url(), request.resourceType());
    request.abort();
  } else {
    request.continue();
  }
});
try {
  await page.goto(url, { waitUntil: ['networkidle0', 'load'], timeout: requestCounterMaxWaitMs });
} catch (e) {

}

Is this the best way to wait for ajax requests to be completed?

It feels right, but I'm not sure whether I should use networkidle0, networkidle2, etc.


Answer 1:


You can use pending-xhr-puppeteer, a library that exposes a promise which resolves once all pending XHR requests have finished.

Use it like this:

const puppeteer = require('puppeteer');
const { PendingXHR } = require('pending-xhr-puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  // Create the tracker before navigating so that no request is missed
  const pendingXHR = new PendingXHR(page);
  await page.goto(`http://page-with-xhr`);
  // At this point some XHR requests may still be pending
  await pendingXHR.waitForAllXhrFinished();
  // At this point all XHR requests have finished
  await browser.close();
})();

DISCLAIMER: I am the maintainer of pending-xhr-puppeteer
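
For readers who would rather not add a dependency, the general idea of tracking in-flight XHR requests can be sketched by hand. This is only a rough illustration and not how pending-xhr-puppeteer is actually implemented: waitForXhrIdle, idleMs and timeoutMs are hypothetical names, and counting "fetch" alongside "xhr" is an assumption about what should count as ajax. It relies only on the page's request, requestfinished and requestfailed events.

```javascript
// Resource types treated as "ajax" here (including 'fetch' is an assumption)
const AJAX_TYPES = new Set(['xhr', 'fetch']);

// Hypothetical helper, not part of Puppeteer or pending-xhr-puppeteer.
// Resolves once no XHR/fetch request has been in flight for idleMs,
// rejects if that never happens within timeoutMs.
function waitForXhrIdle(page, { idleMs = 500, timeoutMs = 30000 } = {}) {
  return new Promise((resolve, reject) => {
    const pending = new Set();
    let idleTimer = null;
    let timeoutTimer = null;

    const cleanup = () => {
      clearTimeout(idleTimer);
      clearTimeout(timeoutTimer);
      page.off('request', onRequest);
      page.off('requestfinished', onDone);
      page.off('requestfailed', onDone);
    };

    // (Re)start the quiet-period countdown
    const armIdleTimer = () => {
      clearTimeout(idleTimer);
      idleTimer = setTimeout(() => {
        cleanup();
        resolve();
      }, idleMs);
    };

    function onRequest(request) {
      if (!AJAX_TYPES.has(request.resourceType())) return;
      pending.add(request);
      clearTimeout(idleTimer); // traffic started, so we are no longer idle
    }

    function onDone(request) {
      // Only react to requests we are tracking
      if (pending.delete(request) && pending.size === 0) armIdleTimer();
    }

    timeoutTimer = setTimeout(() => {
      cleanup();
      reject(new Error(`XHR traffic still active after ${timeoutMs} ms`));
    }, timeoutMs);

    page.on('request', onRequest);
    page.on('requestfinished', onDone);
    page.on('requestfailed', onDone);
    armIdleTimer(); // also resolves if no XHR starts at all
  });
}
```

A possible usage is to call const idle = waitForXhrIdle(page) before page.goto(url) and to await idle afterwards, so that requests fired during navigation are counted. Like networkidle0, this cannot see a request that starts after the quiet period has already elapsed.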




Answer 2:


XHRs, by their nature, can fire at any point in the app's lifetime. networkidle0 will not help if the app sends an XHR, say, one second after load and you want to wait for it, because navigation is considered finished once there have been no network connections for at least 500 ms. I think that to do this "properly" you should know which requests you are waiting for and await them explicitly.

Here is an example where the XHRs fire later in the app and the script waits for all of them:

const puppeteer = require('puppeteer');

const html = `
<html>
  <body>
    <script>
      setTimeout(() => {
        fetch('https://swapi.co/api/people/1/');
      }, 1000);

      setTimeout(() => {
        fetch('https://www.metaweather.com/api/location/search/?query=san');
      }, 2000);

      setTimeout(() => {
        fetch('https://api.fda.gov/drug/event.json?limit=1');
      }, 3000);
    </script>
  </body>
</html>`;

// you can observe only a subset of the requests;
// in this example I'm waiting for all of them
const requests = [
    'https://swapi.co/api/people/1/',
    'https://www.metaweather.com/api/location/search/?query=san',
    'https://api.fda.gov/drug/event.json?limit=1'
];

const waitForRequests = (page, names) => {
  const requestsList = [...names];
  return new Promise(resolve =>
    page.on('request', request => {
      // fetch() calls are reported as "fetch", XMLHttpRequest calls as "xhr"
      if (["xhr", "fetch"].includes(request.resourceType())) {
        // remove the request from the observed list if it is there
        const index = requestsList.indexOf(request.url());
        if (index > -1) {
          requestsList.splice(index, 1);
        }

        // resolve once every observed request has been sent
        if (!requestsList.length) {
          resolve();
        }
      }
      // interception is enabled, so every request must be continued
      request.continue();
    })
  );
};


(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setRequestInterception(true);

  // register page.on('request') observables
  const observedRequests = waitForRequests(page, requests);

  // goto is deliberately not awaited here because only the XHR (ajax)
  // requests matter; awaiting it would work as well
  page.goto(`data:text/html,${html}`);

  console.log('before xhr');
  // await for all observed requests
  await observedRequests;
  console.log('after all xhr');
  await browser.close();
})();
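
When only one known request matters, Puppeteer's built-in page.waitForResponse can replace the hand-written listener and does not require request interception. The following is a minimal sketch under a few assumptions: gotoAndWaitForResponse is a hypothetical helper name, the 30 second default timeout is illustrative, and a response is treated as "expected" when its URL matches exactly and its status is in the 2xx range.

```javascript
// Hypothetical helper: navigates to pageUrl and waits for one known response.
// The function name and the default timeout are illustrative assumptions.
async function gotoAndWaitForResponse(page, pageUrl, apiUrl, timeoutMs = 30000) {
  // Predicate handed to page.waitForResponse: exact URL match and 2xx status
  const isExpected = (response) => response.url() === apiUrl && response.ok();

  // Start waiting before navigating so that an early response is not missed
  const [response] = await Promise.all([
    page.waitForResponse(isExpected, { timeout: timeoutMs }),
    page.goto(pageUrl),
  ]);
  return response;
}
```

Unlike the listener above, which resolves as soon as the requests have been sent, this resolves only after the matching response has arrived, so the response body can be read from the returned object.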


Source: https://stackoverflow.com/questions/49538478/is-there-a-way-to-get-puppeteers-waituntil-networkidle-to-only-consider-xhr
