puppeteer wait for page/DOM updates - respond to new items that are added after initial loading

有些话、适合烂在心里 submitted on 2019-12-10 10:49:57

Question


I want to use Puppeteer to respond to page updates. The page shows items, and while it stays open new items can appear over time, e.g. a new item is added every 10 seconds.

I can use the following to wait for an item on the initial load of the page:

await page.waitForSelector(".item");
console.log("the initial items have been loaded");

How can I wait for / catch future items? I would like to achieve something like this (pseudo code):

await page.goto('http://mysite');
await page.waitForSelector(".item");
// check items (=these initial items)

// event when receiving new items:
// check item(s) (= the additional [or all] items)

Answer 1:


You can use exposeFunction to expose a local function:

await page.exposeFunction('getItem', function(a) {
    console.log(a);
});

Then you can use page.evaluate to create an observer and listen to new nodes created inside a parent node.

The following example (a rough idea rather than finished code) watches the Python chat room on Stack Overflow and prints new items as they are created in that chat.

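const puppeteer = require('puppeteer');

(async () => {
    const baseurl = 'https://chat.stackoverflow.com/rooms/6/python';
    const browser = await puppeteer.launch({headless: false});
    const page = await browser.newPage();
    await page.goto(baseurl);

    // Expose a Node.js function that the page can call as window.getItem
    await page.exposeFunction('getItem', function(a) {
        console.log(a);
    });

    await page.evaluate(() => {
        // Runs in the browser: report every element added under #chat
        const observer = new MutationObserver((mutations) => {
            for (const mutation of mutations) {
                for (const node of mutation.addedNodes) {
                    // Skip text and comment nodes, which have no innerText
                    if (node.nodeType === Node.ELEMENT_NODE) {
                        getItem(node.innerText);
                    }
                }
            }
        });
        observer.observe(document.getElementById("chat"), { attributes: false, childList: true, subtree: true });
    });
})();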
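
The observer above reports every node added under the parent, whereas the question only cares about elements matching ".item". A minimal sketch of one way to narrow that down is shown here; findAddedItems is a hypothetical helper name, not part of Puppeteer or the DOM API, and it would have to be defined inside the page.evaluate callback because it runs in the browser.

```javascript
// Hypothetical helper: collect the elements matching `selector` from a list
// of MutationRecord objects. Intended to live inside the page.evaluate callback.
function findAddedItems(mutations, selector) {
    const items = [];
    for (const mutation of mutations) {
        for (const node of mutation.addedNodes) {
            // nodeType 1 is Node.ELEMENT_NODE; text and comment nodes are skipped
            if (node.nodeType !== 1) continue;
            // The added node may itself be an item ...
            if (node.matches(selector)) items.push(node);
            // ... or it may be a wrapper that contains items
            items.push(...node.querySelectorAll(selector));
        }
    }
    return items;
}

// Inside the observer callback this could be used as (assuming the
// '.item' selector from the question and the exposed getItem function):
//   for (const el of findAddedItems(mutations, '.item')) getItem(el.innerText);
```

Checking both the node itself and its descendants matters because a page may insert an item directly or insert a wrapper element that already contains several items.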


Source: https://stackoverflow.com/questions/54109078/puppeteer-wait-for-page-dom-updates-respond-to-new-items-that-are-added-after
