Puppeteer | 易学教程

Puppeteer / Node.js to click a button as long as it exists — and when it no longer exists, commence action

Question: There is a web page that contains many rows of data that are continually updated. The number of rows is fixed, so old rows are cycled out and not stored anywhere. The page is broken up by a "load more" button that keeps appearing until all of the stored rows are displayed. I need to write a script in Puppeteer / Node.js that clicks that button until it no longer exists on the page... THEN ...reads all the text on the page. (I have this part of the script finished.) I am new to
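One way to approach the "click until it is gone" part is sketched below. It is a minimal sketch, not the asker's or any answerer's code: `clickUntilGone`, `maxClicks`, and `settleMs` are hypothetical names, and the fixed pause between clicks is an assumption about how long the page needs to render the next batch of rows. It relies only on `page.$()`, which resolves to `null` when nothing matches the selector, and `elementHandle.click()`.

```javascript
// Hypothetical helper: `page` is a Puppeteer Page (or anything exposing `$`),
// `selector` matches the "load more" button.
async function clickUntilGone(page, selector, { maxClicks = 1000, settleMs = 500 } = {}) {
  let clicks = 0;
  while (clicks < maxClicks) {
    // page.$ resolves to null once no element matches the selector.
    const button = await page.$(selector);
    if (button === null) break;
    await button.click();
    clicks += 1;
    // Assumed pause so the next batch of rows can render; tune for the real page.
    await new Promise((resolve) => setTimeout(resolve, settleMs));
  }
  return clicks; // number of clicks performed before the button disappeared
}
```

After `await clickUntilGone(page, '#load-more')` resolves (the selector here is only an example), the script can go on to read the page text. The `maxClicks` cap is there so a button that never disappears cannot loop forever.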

Is it possible to pass a React Component to puppeteer?

Question: I have a React component with some componentDidMount logic: export default class MyComponent { componentDidMount() { // some changes to DOM done here by a library } render() { return ( <div>{props.data}</div> ); } } Is it possible to somehow pass this component, with props, to Puppeteer so that everything in componentDidMount() gets executed, in order to take a screenshot? Something along these lines: const browser = await puppeteer.launch({ headless: true }); const page = await browser.newPage

PuppeteerSharp: a headless-browser .NET SDK (Puppeteer)

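Puppeteer: First, some background. Puppeteer is an official Google Node.js library that controls headless Chrome over the DevTools Protocol. What is a headless browser? Put simply, it is a browser without a user interface. By calling the APIs the browser exposes, you can build a rich set of features. There are examples online of using Puppeteer to write web crawlers. With this headless capability, we can conveniently work on the server side and carry out complex web page interactions. Puppeteer supports exporting images (JPG, PNG), PDF, and more. This post focuses on PuppeteerSharp, an SDK that implements Puppeteer in C#. PuppeteerSharp: PuppeteerSharp targets .NET Standard 2.0; the minimum platform versions are .NET Framework 4.6.1 and .NET Core 2.0. Next, let's build a sample program. Here I create a new .NET Core 2.0 console application and add a reference to PuppeteerSharp through NuGet. Then we write a sample program that takes the "OSC" home page as an example and exports it to PDF. static void Main(string[] args) { Test().Wait(); Console.WriteLine("Hello World!"); } static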

Headless Chrome Node API and Puppeteer installation

Question: While installing headless Chrome on a clean Ubuntu 18.04, I ran into quite a few issues. The setup guide on GitHub is not sufficient for a clean Ubuntu 18.04. The following are some of the errors, with solutions, for setting up headless Chrome as an alternative to PhantomJS. Error 1 (node:23835) UnhandledPromiseRejectionWarning: Error: Chromium revision is not downloaded. Run "npm install" or "yarn install" at Launcher.launch owlcommand.com /puppeteer/node_modules/puppeteer/lib

Pressing Enter button in puppeteer

Question: Pressing Enter in Puppeteer doesn't seem to have any effect. However, when I press other keys, they do what they should. This works: await page.press('ArrowLeft'); This doesn't: await page.press('Enter'); This is what the input looks like: Any ideas? EDIT: I've also tried page.keyboard.down & page.keyboard.up to be sure. Answer 1: await page.type(String.fromCharCode(13)); Using this site I noticed that page.type dispatches beforeinput and input events, but page.press doesn't. This is probably a bug,
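For comparison, here is a minimal sketch of the same action written against the `page.keyboard` API. The helper name `submitWithEnter` and the idea of focusing the input first are assumptions for illustration; whether this resolves the behaviour described in the question depends on the page and the Puppeteer version in use.

```javascript
// Hypothetical helper: focus an input, then send Enter through the keyboard API.
// `page` is a Puppeteer Page; `inputSelector` is a CSS selector for the input.
async function submitWithEnter(page, inputSelector) {
  // Make sure the key event is delivered to the intended element.
  await page.focus(inputSelector);
  // keyboard.press sends a keydown followed by a keyup for the given key.
  await page.keyboard.press('Enter');
}
```

A call would look like `await submitWithEnter(page, 'input[name=q]')`, where the selector is only an example.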

Node.js puppeteer - Fetching content from a complex txt file

Question: How can I download, access and process a complex txt file in Puppeteer? I can access an XML file (Node.js puppeteer - Downloading/Accessing a xml file and process the content) like this: await page.goto(myPage, {waitUntil: 'load'}); const newPage = await page.evaluate(() => { var columns = document.getElementsByTagName("VALUEA"); var values = {"values":[]}; for(let f in columns){ values.values.push(columns[f].innerText); } return JSON.stringify(values); }); console.log(JSON.parse(newPage))

Puppeteer: wait until an element is visible?

Question: I would like to know if I can tell Puppeteer to wait until an element is displayed. const inputValidate = await page.$('input[value=validate]'); await inputValidate.click() //I want to do something like that waitElemenentVisble('.btnNext ') const btnNext = await page.$('.btnNext'); await btnNext.click(); Is there any way I can accomplish this? Answer 1: I think you can use the page.waitForSelector(selector[, options]) function for that purpose. const puppeteer = require('puppeteer'); puppeteer.launch(
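The `options` argument mentioned in the answer accepts a `visible` flag, which makes `waitForSelector` wait until the element is visible rather than merely present in the DOM. A minimal sketch, assuming a hypothetical helper name `clickWhenVisible` and an arbitrary 30-second default timeout:

```javascript
// Hypothetical helper: wait until `selector` is visible, then click it.
// `page` is a Puppeteer Page.
async function clickWhenVisible(page, selector, timeoutMs = 30000) {
  // With { visible: true } the promise resolves only once the element is
  // visible; it rejects if that does not happen within `timeoutMs`.
  const handle = await page.waitForSelector(selector, {
    visible: true,
    timeout: timeoutMs,
  });
  await handle.click();
  return handle;
}
```

In the question's flow this would replace the `page.$('.btnNext')` lookup: `await clickWhenVisible(page, '.btnNext')`.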

Puppeteer - How to connect WSEndpoint using local IP address?

Question: I have two Node.js scripts for Puppeteer automation. 1) launcher.js This Puppeteer script launches a Chrome browser and then disconnects from it, so that it can be connected to later using the WSEndpoint. const puppeteer = require('puppeteer'); module.exports = async () => { try { const options = { headless: false, devtools: false, ignoreHTTPSErrors: true, args: [ `--no-sandbox`, `--disable-setuid-sandbox`, `--ignore-certificate-errors` ] }; const browser = await puppeteer.launch(options); let pagesCount

Puppeteer detect when the new tab is opened

Question: A very simple question. My web app opens a new tab under some conditions. But when I try to get all tabs (await browser.pages()) I get only the one initial page back. How can I get the new page's object in my code? This happens when you don't create the new tab through Puppeteer with await browser.newPage(), but instead do something like this: await (await browser.pages())[0].evaluate(() => { window.open('http://www.example.com', '_blank'); }); The page won't be available in the browser.pages()
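One common approach to this kind of problem is to listen for the browser's `targetcreated` event and ask the new target for its page. The sketch below is an illustration under assumptions, not the accepted answer: `waitForNewPage` and its timeout are hypothetical, and it uses only `browser.on` / `browser.off`, `target.type()`, and `target.page()`.

```javascript
// Hypothetical helper: resolve with the Page of the next tab the browser opens.
// `browser` is a Puppeteer Browser (any emitter of 'targetcreated' works).
function waitForNewPage(browser, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    let timer = null;
    const onTarget = async (target) => {
      // Ignore service workers, background pages, and other non-tab targets.
      if (target.type() !== 'page') return;
      clearTimeout(timer);
      browser.off('targetcreated', onTarget);
      try {
        resolve(await target.page());
      } catch (err) {
        reject(err);
      }
    };
    timer = setTimeout(() => {
      browser.off('targetcreated', onTarget);
      reject(new Error('No new page opened within ' + timeoutMs + ' ms'));
    }, timeoutMs);
    browser.on('targetcreated', onTarget);
  });
}
```

The listener has to be registered before the action that opens the tab, for example: `const pending = waitForNewPage(browser); await page.evaluate(() => window.open('http://www.example.com', '_blank')); const newPage = await pending;`.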

Try Catch unable to catch UnhandledPromiseRejectionWarning

Question: I thought I had a pretty good catch in place for those rare timeouts that I get from Puppeteer, but somehow this timeout is not caught by any of them. My question is: why? Here is the code: var readHtml = (url) => { return new Promise( async (resolve,reject)=> { var browser = await puppeteer.launch() var page = await browser.newPage() await page.waitForSelector('.allDataLoaded') .then(() => { console.log ("Finished reading: " + url) return resolve("COOL"); }) .catch((err) => { console.log (
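A general point that may be relevant here: when the executor passed to `new Promise` is itself an `async` function, any rejection from an `await` inside it that has no `.catch` attached is not forwarded to the outer promise, so the caller's `try`/`catch` never sees it and Node reports an unhandled rejection. The sketch below shows the same flow written as a plain `async` function instead. It is an illustration, not the asker's final code: `launchBrowser` is a hypothetical stand-in for `() => puppeteer.launch()`, and the `page.goto(url)` call is an assumption, since the excerpt does not show a navigation step.

```javascript
// Minimal sketch: an async function propagates every rejection to its caller.
// `launchBrowser` is a hypothetical stand-in for () => puppeteer.launch().
async function readHtml(url, launchBrowser) {
  const browser = await launchBrowser();
  try {
    const page = await browser.newPage();
    await page.goto(url); // assumed step; not shown in the original excerpt
    // A timeout here rejects the promise returned by readHtml itself.
    await page.waitForSelector('.allDataLoaded');
    console.log('Finished reading: ' + url);
    return 'COOL';
  } finally {
    // Close the browser on both the success and the failure path.
    await browser.close();
  }
}
```

With this shape, a caller can wrap the call in an ordinary `try { await readHtml(url, () => puppeteer.launch()); } catch (err) { ... }` and the timeout arrives in the `catch` block.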