Selecting href attributes with Puppeteer

Posted by 扶醉桌前 on 2019-12-23 20:43:21

Question


I am trying to extract a few URLs from this page with Puppeteer.

However, all my script returns is undefined:

const puppeteer = require('puppeteer');

async function run() {
    const browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox']});
    const page = await browser.newPage();

    await page.goto('https://divisare.com/');

    let projects = await page.evaluate((sel) => {
        return document.getElementsByClassName(sel)
    }, 'homepage-project-image');

    var aNode = projects[0].href;

    console.log(aNode);
    console.log(projects.length)

    browser.close();
}
run();

However, when I run something like the snippet below, I at least get the correct count of the links I am trying to extract.

let projects = await page.evaluate((sel) => {

    return document.getElementsByClassName(sel).length
}, 'homepage-project-image');


console.log(projects);

Am I trying to access my projects HTMLCollection incorrectly? What am I missing here? Thanks.


Answer 1:


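Puppeteer cannot return a non-serializable value from page.evaluate (see this issue and the following PR). An HTMLCollection of DOM nodes is not serializable, so the call resolves to undefined, whereas a plain number such as .length is passed back to Node intact.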

One way to solve this is to read the href inside the page and return only the string:

let projects = await page.evaluate((sel) => {
    return document.getElementsByClassName(sel)[0].href;
}, 'homepage-project-image');

Remember that document.getElementsByClassName returns an HTMLCollection, which has no map method, so to collect every href you need to convert it to an array first:

let projects = await page.evaluate((sel) => {
    return Array.from(document.getElementsByClassName(sel)).map(node => node.href);
}, 'homepage-project-image');
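
Putting the pieces together, a minimal sketch of the corrected script might look like the following. The helper names collectHrefs and collectHrefsBySelector are hypothetical and only used for illustration. The second helper uses page.$$eval, which is not part of the answer above; it is shown as one possible alternative that hands the matched elements to the callback as an array.

```javascript
// Hypothetical helper: the callback runs inside the page and returns plain
// strings, which can be serialized back to Node (DOM nodes cannot).
async function collectHrefs(page, className) {
  return page.evaluate((sel) => {
    return Array.from(document.getElementsByClassName(sel)).map((node) => node.href);
  }, className);
}

// Alternative, not from the answer above: page.$$eval takes a CSS selector
// and passes the matched elements to the callback as an array.
async function collectHrefsBySelector(page, selector) {
  return page.$$eval(selector, (nodes) => nodes.map((node) => node.href));
}

async function run() {
  // Required here so the helpers above can be reused without launching a browser.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox'] });
  try {
    const page = await browser.newPage();
    await page.goto('https://divisare.com/');

    const links = await collectHrefs(page, 'homepage-project-image');
    console.log(links.length); // number of matched elements
    console.log(links[0]);     // href of the first match
  } finally {
    // Close the browser even if navigation or evaluation throws.
    await browser.close();
  }
}
```

Call run() at the end of the script to execute it. With collectHrefsBySelector the class name is written as a CSS selector, for example '.homepage-project-image'.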


Source: https://stackoverflow.com/questions/50147199/selecting-href-attributers-with-puppeteer
