Puppeteer

Can a har file be programmatically generated from headless chrome using Puppeteer?

自古美人都是妖i posted on 2019-12-05 14:15:12
I would like to control a headless Chrome instance using Puppeteer, taking snapshots and clicking on various page elements, while capturing a HAR file. Is this possible? I have looked at the API but haven't found anything useful.
Puppeteer has no built-in HAR generator, but you can use chrome-har to generate a HAR file:
const fs = require('fs');
const { promisify } = require('util');
const puppeteer = require('puppeteer');
const { harFromMessages } = require('chrome-har');
// list of events for converting to HAR
const events = [];
// event types to observe
const observe = [ 'Page
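The approach in the answer above (record DevTools protocol events, then hand them to chrome-har) can be sketched roughly as follows. This is a minimal sketch, not the original answer's code: the function name `recordProtocolEvents` is hypothetical, and the event list is an illustrative selection of DevTools protocol event names, since the original list is cut off. It assumes `client` is a Puppeteer CDP session, which behaves as an event emitter.

```javascript
// Illustrative selection of DevTools protocol events (an assumption;
// the list in the original answer is truncated).
const OBSERVED_EVENTS = [
  'Page.loadEventFired',
  'Page.domContentEventFired',
  'Network.requestWillBeSent',
  'Network.responseReceived',
  'Network.dataReceived',
  'Network.loadingFinished',
  'Network.loadingFailed',
];

// Hypothetical helper: subscribes to each event on a CDP session and
// collects them as { method, params } objects, the shape that
// chrome-har's harFromMessages() takes as input.
function recordProtocolEvents(client, methods = OBSERVED_EVENTS) {
  const events = [];
  for (const method of methods) {
    client.on(method, (params) => events.push({ method, params }));
  }
  return events;
}

// Usage sketch (assumes `page` is a Puppeteer Page):
//   const client = await page.target().createCDPSession();
//   await client.send('Page.enable');
//   await client.send('Network.enable');
//   const events = recordProtocolEvents(client);
//   await page.goto(url);
//   const har = harFromMessages(events); // from chrome-har
```

Keeping the recording step separate from the browser launch means the same helper can keep collecting events while the script clicks through the page and takes snapshots.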

Refresh when an element changes on page

戏子无情 posted on 2019-12-05 10:15:20
Question: I am trying to scrape an element from a website and display it on localhost with Puppeteer (1). When this element changes, I would like to refresh the data without opening a new browser or page with Puppeteer, and only when the element changes (2). For my example, I use www.timeanddate.com and the element is the time (hours and minutes). For the moment, only the first part works; I have no solution for the second. My code is below.
app.js
var app = require('express')();
var server = require('http')

npm install puppeteer showing permission denied errors

帅比萌擦擦* posted on 2019-12-05 09:34:28
I'm unable to install Puppeteer as a project dependency, and I've tried reinstalling Node. Does anyone have an idea how to fix this? I'm running Ubuntu 17.10 x64.
sudo apt-get purge nodejs; curl -sL https://deb.nodesource.com/setup_8.x | sudo -E bash -; apt-get install -y nodejs; sudo npm install -g n; sudo n stable;
Node versions:
$ node -v
v9.4.0
$ npm -v
5.6.0
I try to install:
root@server:/var/www/html# npm install --save puppeteer
Error message:
> puppeteer@1.1.0 install /var/www/html/node_modules/puppeteer
> node install.js
ERROR: Failed to download Chromium r536395! Set "PUPPETEER_SKIP

Looping through links(stories) and taking screenshots

微笑、不失礼 posted on 2019-12-05 08:28:56
What I'm trying to do here is loop through Storybook stories so I can perform visual regression testing on them:
const puppeteer = require('puppeteer');
const { toMatchImageSnapshot } = require('jest-image-snapshot');
expect.extend({ toMatchImageSnapshot });
test('no visual regression for button', async () => {
  const selector = 'a[href*="?selectedKind=Buttons&selectedStory="]';
  const browser = await puppeteer.launch({headless:false, slowMo: 350});
  const page = await browser.newPage();
  await page.goto('http://localhost:8080');
  const storyLinks = await page.evaluate(() => {
    const stories = Array

Communicate “out” from Chromium via DevTools protocol

三世轮回 posted on 2019-12-05 05:52:25
I have a page running in a headless Chromium instance, and I'm manipulating it via the DevTools protocol, using the Puppeteer NPM package in Node. I'm injecting a script into the page. At some point, I want the script to call me back and send me some information (via some event exposed by the DevTools protocol or some other means). What is the best way to do this? It'd be great if it can be done using Puppeteer, but I'm not against getting my hands dirty and listening for protocol messages by hand. I know I can sort-of do this by manipulating the DOM and listening to DOM changes, but that
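One way to get a callback out of the page without watching the DOM is Puppeteer's `page.exposeFunction`, which adds a function to the page's `window` that runs in Node when called. The sketch below is an illustration rather than the answer from the original thread; the helper `exposeReporter` and the in-page name `reportToNode` are hypothetical, and `page` is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: registers window[name] inside the page. When the
// injected script calls it, onMessage runs in Node with the argument,
// which must be serializable.
function exposeReporter(page, onMessage, name = 'reportToNode') {
  return page.exposeFunction(name, (payload) => onMessage(payload));
}

// Usage sketch (assumes `page` is a Puppeteer Page):
//   await exposeReporter(page, (msg) => console.log('from page:', msg));
//   await page.evaluate(() => window.reportToNode({ ready: true }));
```

Because the exposed function returns a promise inside the page, the injected script can also await it if it needs a reply from Node.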

Puppeteer | Wait for all JavaScript is executed

空扰寡人 posted on 2019-12-05 04:38:22
I am trying to take screenshots of multiple pages, which should be fully loaded (including lazy-loaded images) for later comparison. I found the lazyimages_without_scroll_events.js example, which helps a lot. With the following code the screenshots look fine, but there is one major issue.
async function takeScreenshot(browser, viewport, route) {
  return browser.newPage().then(async (page) => {
    const fileName = `${viewport.directory}/${getFilename(route)}`;
    await page.setViewport({ width: viewport.width, height: 500, });
    await page.goto( `${config.server.master}${route}.html`, { waitUntil:

Want to scrape table using puppeteer.js. How can I get all rows, iterate through rows and then get “td's” for each row

余生颓废 posted on 2019-12-05 04:32:23
I have Puppeteer set up and was able to get all rows using let rows = await page.$$eval('#myTable tr', row => row); Now, for each row, I want to get the td elements and then the inner text of each. Basically I want to do this with Puppeteer: var tds = myRow.querySelectorAll("td"); where myRow is a table row.
One way to achieve this is to use evaluate to first get an array of all the TDs and then return the textContent of each TD:
const puppeteer = require('puppeteer');
const html = `
<html> <body> <table> <tr><td>One</td><td>Two</td></tr> <tr><td>Three</td><td>Four</td></tr> </table> </body> </html>
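To keep the row structure rather than a flat list of cells, the per-row lookup from the question can be done inside the function passed to `page.$$eval`. This is a minimal sketch with a hypothetical helper name, `extractCellText`; it assumes the `#myTable` selector from the question. As far as I know, returning the row elements themselves (`row => row`) does not give usable results, because DOM elements cannot be serialized back to Node, so the function returns plain strings instead.

```javascript
// Runs inside the page when passed to page.$$eval(); `rows` is the array
// of matched <tr> elements. Returns one array of cell strings per row.
function extractCellText(rows) {
  return rows.map((row) =>
    Array.from(row.querySelectorAll('td'), (td) => td.textContent.trim())
  );
}

// Usage sketch (assumes `page` is a Puppeteer Page showing the table):
//   const table = await page.$$eval('#myTable tr', extractCellText);
//   for (const cells of table) console.log(cells.join(' | '));
```

The helper must not reference variables from the surrounding Node scope, since Puppeteer sends only the function's source text to the browser.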

How to inject mutationobserver to puppeteer

生来就可爱ヽ(ⅴ<●) posted on 2019-12-05 02:53:05
Question: I want to trace DOM changes, as MutationObserver does, in headless Chrome. I am learning the Puppeteer library but don't know how to do that. Is it possible to trace DOM changes in Puppeteer? Thanks.
Answer 1: Well, you can inject custom code into the browser. One way:
await page.evaluate(() => {
  const observer = new MutationObserver(
    function() {
      // communicate with node through console.log method
      console.log('__mutation')
    }
  )
  const config = { attributes: true, childList: true, characterData: true, subtree:

puppeteer - how to set download location

不羁的心 posted on 2019-12-05 02:11:19
I was able to successfully download a file with Puppeteer, but it was just saving it to my /Downloads folder. I've been looking around and can't find anything in the API or forums to set this location. My download basically just goes to the link: await page.goto(url);
This is how you can set the download path in the latest Puppeteer, v0.13: await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './myAwesomeDownloadFolder'}); The behaviour is experimental; it might be removed, modified, or changed later. Pst, you can try more tricks listed here, on your own
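The answer above relies on `page._client`, a private property. A variation that may be more robust is to send the same `Page.setDownloadBehavior` command through a CDP session created with `page.target().createCDPSession()`, and to resolve the folder to an absolute path so it does not depend on the working directory. The sketch below is an assumption-laden illustration, not the original answer: `downloadBehaviorParams` is a hypothetical helper, and the command itself remains experimental, as noted above.

```javascript
const path = require('path');

// Hypothetical helper: builds the parameters for the DevTools command
// Page.setDownloadBehavior, resolving the folder to an absolute path.
function downloadBehaviorParams(dir) {
  return { behavior: 'allow', downloadPath: path.resolve(dir) };
}

// Usage sketch (assumes `page` is a Puppeteer Page):
//   const client = await page.target().createCDPSession();
//   await client.send(
//     'Page.setDownloadBehavior',
//     downloadBehaviorParams('./myAwesomeDownloadFolder')
//   );
//   await page.goto(url);
```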

Puppeteer wait until page is completely loaded

半腔热情 posted on 2019-12-05 01:40:35
Question: I am working on creating a PDF from a web page. The application I am working on is a single-page application. I tried many of the options and suggestions at https://github.com/GoogleChrome/puppeteer/issues/1412, but it is not working.
const browser = await puppeteer.launch({
  executablePath: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
  ignoreHTTPSErrors: true,
  headless: true,
  devtools: false,
  args: ['--no-sandbox', '--disable-setuid-sandbox']
});
const page = await browser.newPage