How to use xpath in chrome headless+puppeteer evaluate()?

自闭症患者 2020-12-09 05:26

How can I use $x() to evaluate an XPath expression inside page.evaluate()?

Since page is not available in that context, I tried $x()

2 answers
  • 2020-12-09 05:49

    $x() is not a standard JavaScript method for selecting elements by XPath. It is only a helper in Chrome DevTools. The documentation states this explicitly:

    Note: This API is only available from within the console itself. You cannot access the Command Line API from scripts on the page.

    Code run through page.evaluate() counts here as a "script on the page".

    You have two options:

    1. Use document.evaluate

    Here is an example of selecting an element (the featured article) inside page.evaluate():

    const puppeteer = require('puppeteer');
    
    (async () => {
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });
    
        const text = await page.evaluate(() => {
            // $x() is not part of standard JavaScript;
            // it is only a convenience helper in Chrome DevTools,
            // so use document.evaluate() instead
            const featureArticle = document
                .evaluate(
                    '//*[@id="mp-tfa"]',
                    document,
                    null,
                    XPathResult.FIRST_ORDERED_NODE_TYPE,
                    null
                )
                .singleNodeValue;
    
            return featureArticle.textContent;
        });
    
        console.log(text);
        await browser.close();
    })();
    
    2. Select the element with Puppeteer's page.$x() and pass it to page.evaluate()

    This example achieves the same result as the first example:

    const puppeteer = require('puppeteer');
    
    (async () => {
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });
    
        // await page.$x() resolves to an array of ElementHandles;
        // we are only interested in the first one
        const featureArticle = (await page.$x('//*[@id="mp-tfa"]'))[0];
        // the same as:
        // const featureArticle = await page.$('#mp-tfa');
    
        const text = await page.evaluate(el => {
            // do what you want with featureArticle in page.evaluate
            return el.textContent;
        }, featureArticle);
    
        console.log(text);
        await browser.close();
    })();
    

    A related question covers how to inject a $x() helper function into your own scripts.
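
One way to get a $x()-like helper into your own scripts is to wrap document.evaluate() yourself. The sketch below is only an illustration: the name xpathAll and its doc parameter are made up here (they are not part of Puppeteer or DevTools), and the literal 7 is the standard value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.

```javascript
// Hypothetical helper (not a Puppeteer or DevTools API): returns every node
// matching an XPath expression as a plain array, similar to the DevTools $x().
// The doc parameter is an assumption added so the function can be exercised
// outside a browser; in a page it defaults to the global document.
function xpathAll(expression, contextNode, doc = document) {
    // 7 === XPathResult.ORDERED_NODE_SNAPSHOT_TYPE; the literal is used so the
    // function does not depend on the XPathResult global being in scope.
    const snapshot = doc.evaluate(expression, contextNode || doc, null, 7, null);
    const nodes = [];
    for (let i = 0; i < snapshot.snapshotLength; i++) {
        nodes.push(snapshot.snapshotItem(i));
    }
    return nodes;
}
```

Because the callback given to page.evaluate() is serialized and run in the page, it cannot see functions defined on the Node side. One option is to inject the source first, for example with await page.addScriptTag({ content: xpathAll.toString() }), after which xpathAll('//*[@id="mp-tfa"]') can be called inside page.evaluate(). Return serializable values (such as textContent) from the callback rather than the DOM nodes themselves.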

  • 2020-12-09 05:52

    If you insist on using page.$x(), you can simply pass the result to page.evaluate():

    const example = await page.evaluate(element => {
      return element.textContent;
    }, (await page.$x('//*[@id="result"]'))[0]);
    