Puppeteer

puppeteer stream playing <video> to node.js buffer

别说谁变了你拦得住时间么 posted on 2019-12-02 10:32:30
Using Puppeteer, I'm able to navigate to a certain video src URL, and the MP4 plays fine (using a custom build of Chromium). Now I want to get the video data that's playing and send it to some kind of buffer in Node.js, so that it can be saved as a file, sent to a client via a WebSocket, or sent as a response. I'm not sure how to do it; all I have is the video playing. I can't just send the URL over to Node.js, because in order to view the video file you have to go through the whole Puppeteer crawling process (it's not just a static URL, it's dependent on that browser
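One possible approach, as a rough sketch only: fetch the video bytes from inside the page (so the request reuses that browser session) and hand them to Node.js as base64. This assumes the `<video>` element has a directly fetchable `src` (not a `blob:` MediaSource URL), that the file fits in memory, and that the request is allowed by the server. The function name `fetchVideoAsBuffer` and the default selector are hypothetical.

```javascript
// Sketch: copy the bytes behind a <video> element into a Node.js Buffer.
// `page` is a Puppeteer Page; `selector` is a hypothetical default.
async function fetchVideoAsBuffer(page, selector = 'video') {
  // This callback runs in the browser, so the fetch carries the page's session.
  const base64 = await page.evaluate(async (sel) => {
    const video = document.querySelector(sel);
    if (!video) throw new Error(`No element matches ${sel}`);
    const response = await fetch(video.currentSrc || video.src, {
      credentials: 'include',
    });
    const bytes = new Uint8Array(await response.arrayBuffer());
    // evaluate() can only return serializable values, so encode as base64.
    // Work in chunks: String.fromCharCode fails on very long argument lists.
    let binary = '';
    const chunkSize = 0x8000;
    for (let i = 0; i < bytes.length; i += chunkSize) {
      binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
    }
    return btoa(binary);
  }, selector);
  // Back in Node.js: decode into a Buffer that can be saved or sent on.
  return Buffer.from(base64, 'base64');
}

// Usage (inside an async function, with `page` already on the video page):
//   const buffer = await fetchVideoAsBuffer(page);
//   require('fs').writeFileSync('video.mp4', buffer);
```

For large files, passing everything through one base64 string may be too slow or hit protocol size limits, so reading the response in ranges could be a better fit.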

How to scrape javascript hash links content?

谁说胖子不能爱 posted on 2019-12-02 09:42:02
Hi, I'm a bit new to web scraping with Puppeteer, and I'm currently facing the following problem: the site I'm trying to extract information from has a Bootstrap table with typical JS pagination, like the examples from https://getbootstrap.com/docs/4.1/components/pagination/. When I check the page HTML with the Chrome inspector, all I can see is 2, and when I check the link location I see https://webpage.com/works#. How can I know how many pages there are in total, and how can I click them? I don't understand how to visit every page with this type of pagination. Thanks! There is no foolproof way, but I deal
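As a rough illustration of one way to handle hash-link pagination: click each numbered pagination link in turn and read the table after every click. The selectors and the function name `scrapePaginatedTable` are assumptions (based on the Bootstrap pagination markup), not something taken from the site in the question.

```javascript
// Sketch: visit every numbered page of a JS-driven pagination widget.
// `page` is a Puppeteer Page; both selectors are hypothetical defaults.
async function scrapePaginatedTable(page, options = {}) {
  const {
    linkSelector = 'ul.pagination a.page-link',
    rowSelector = 'table tbody tr',
  } = options;

  // Read the visible label of every pagination link ("Previous", "1", "2", ...).
  const labels = await page.$$eval(linkSelector, (links) =>
    links.map((a) => a.textContent.trim()));

  const pages = [];
  for (let i = 0; i < labels.length; i++) {
    // Skip "Previous"/"Next" style links; only numbered links are visited.
    if (!/^\d+$/.test(labels[i])) continue;

    // Re-query before each click: handles can go stale if the widget re-renders.
    const links = await page.$$(linkSelector);
    if (!links[i]) break;
    await links[i].click();

    // Collect the text of the rows currently shown in the table.
    const rows = await page.$$eval(rowSelector, (trs) =>
      trs.map((tr) => tr.innerText.trim()));
    pages.push(rows);
  }
  // One entry per numbered link, so pages.length is the page count here.
  return pages;
}
```

If the table is refilled asynchronously after a click, a wait (for example `page.waitForFunction` on the row contents) would probably be needed between the click and the read.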

Is there a way to add a script that adds new functions to the evaluate() context of Chrome + Puppeteer?

僤鯓⒐⒋嵵緔 posted on 2019-12-02 07:10:18
Based on this response, is there a way (like with casperjs/phantomjs) to add our own custom functions to the page.evaluate() context? For example, to include a file with a helper function x that calls an XPath function: x('//a/@href') You can register helper functions in a separate page.evaluate() call. page.exposeFunction() looks tempting, but it doesn't have access to the browser context (and you need the document object). Here is an example of registering a helper function with $x(): const puppeteer = require('puppeteer'); const helperFunctions = () => { window.$x = xPath => document .evaluate( xPath,
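The idea described above (registering a helper in its own page.evaluate() call) could look roughly like the following. This is a self-contained sketch rather than the original answer's code, which is cut off; the snapshot result type and the `collectHrefs` wrapper are assumptions.

```javascript
// Runs in the browser: defines window.$x so later evaluate() calls can use it.
const helperFunctions = () => {
  window.$x = (xPath) => {
    const snapshot = document.evaluate(
      xPath, document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
    const nodes = [];
    for (let i = 0; i < snapshot.snapshotLength; i++) {
      nodes.push(snapshot.snapshotItem(i));
    }
    return nodes;
  };
};

// `page` is a Puppeteer Page that has already navigated somewhere.
async function collectHrefs(page) {
  // First call registers the helper in the page's window.
  await page.evaluate(helperFunctions);
  // Attribute nodes are not serializable, so map them to strings in the page.
  return page.evaluate(() => window.$x('//a/@href').map((attr) => attr.value));
}
```

Helpers registered this way disappear on navigation; `page.evaluateOnNewDocument(helperFunctions)` is one way to have them re-registered for every new document.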

page.evaluate Vs. Puppeteer $ methods

谁都会走 posted on 2019-12-02 05:28:19
I'm interested in the differences between these two blocks of code. const $anchor = await page.$('a.buy-now'); const link = await $anchor.getProperty('href'); await $anchor.click(); await page.evaluate(() => { const $anchor = document.querySelector('a.buy-now'); const text = $anchor.href; $anchor.click(); }); I've generally found raw DOM elements in page.evaluate() easier to work with, and the ElementHandles returned by the $ methods an abstraction too far. However, I felt that perhaps the async Puppeteer methods might be more performant or improve reliability? I couldn't find any guidance on this in the
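To make the comparison concrete, here is a small sketch of both styles written as functions that return the link's href. The function names are hypothetical. Note that `getProperty()` returns a JSHandle, so `jsonValue()` is needed to get the string, and that `ElementHandle.click()` scrolls the element into view and clicks it with Puppeteer's mouse, whereas the DOM `click()` only dispatches a click event.

```javascript
// Style 1: ElementHandle methods, driven from Node.js.
async function clickViaHandle(page, selector) {
  const handle = await page.$(selector);
  if (!handle) throw new Error(`No element matches ${selector}`);
  // getProperty() yields a JSHandle; jsonValue() turns it into a plain string.
  const href = await (await handle.getProperty('href')).jsonValue();
  // Scrolls into view if needed and clicks the element's centre with the mouse.
  await handle.click();
  return href;
}

// Style 2: raw DOM calls inside a single page.evaluate().
async function clickViaEvaluate(page, selector) {
  return page.evaluate((sel) => {
    const anchor = document.querySelector(sel);
    if (!anchor) throw new Error(`No element matches ${sel}`);
    const href = anchor.href;
    // Dispatches a click event directly; visibility is not checked.
    anchor.click();
    return href;
  }, selector);
}
```

The second style needs a single round trip to the browser, while the first makes one per call, so which is preferable likely depends on whether realistic input or fewer round trips matters more.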

Communicating between the main and renderer function in Puppeteer

…衆ロ難τιáo~ posted on 2019-12-02 02:40:46
Is there a way to communicate between the main and renderer processes in Puppeteer, similar to the ipcMain and ipcRenderer modules in Electron? A simple application is demonstrated in this post. I find this functionality useful for debugging, by triggering events from the page to the main function and vice versa. Debugging: Puppeteer has various page events that can be used for debugging (linked here), and it recently added async stack traces so you can track errors more precisely. Event emitting: you can use Node's default events module together with exposeFunction to build your own event system. Refer
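A minimal sketch of the events-plus-exposeFunction idea mentioned above, under a few assumptions: the binding name `emitToNode` and the helper names `createPageBridge` and `sendToPage` are made up for illustration, and payloads must be JSON-serializable.

```javascript
const { EventEmitter } = require('node:events');

// Page -> Node: expose a function on window that forwards to an EventEmitter.
async function createPageBridge(page, bindingName = 'emitToNode') {
  const events = new EventEmitter();
  await page.exposeFunction(bindingName, (name, payload) => {
    events.emit(name, payload);
  });
  return events;
}

// Node -> page: dispatch a CustomEvent on window inside the page.
async function sendToPage(page, name, detail) {
  await page.evaluate((eventName, eventDetail) => {
    window.dispatchEvent(new CustomEvent(eventName, { detail: eventDetail }));
  }, name, detail);
}

// Usage (inside an async function, with `page` from browser.newPage()):
//   const events = await createPageBridge(page);
//   events.on('ready', (payload) => console.log('page says', payload));
//   // In page code: window.emitToNode('ready', { ok: true });
//   await sendToPage(page, 'from-node', { hello: 'page' });
```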

puppeteer: Access JSON response of a specific request as in the network tab of DevTools

穿精又带淫゛_ posted on 2019-12-02 01:47:27
Question: I'd like to directly get the response of the last HTTP request shown in the screenshot. The current Puppeteer code is shown below. Could anybody show me how to modify it so that it gets the JSON response directly from the browser? Thanks. const puppeteer = require('puppeteer'); (async () => { // const browser = await puppeteer.launch(); const browser = await puppeteer.launch({ headless: false, args: ['--user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML

puppeteer: Access JSON response of a specific request as in the network tab of DevTools

情到浓时终转凉″ posted on 2019-12-01 22:50:57
I'd like to directly get the response of the last HTTP request shown in the screenshot. The current Puppeteer code is shown below. Could anybody show me how to modify it so that it gets the JSON response directly from the browser? Thanks. const puppeteer = require('puppeteer'); (async () => { // const browser = await puppeteer.launch(); const browser = await puppeteer.launch({ headless: false, args: ['--user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3312.0 Safari/537.36"'] }); const page = await browser.newPage(); await page
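One way this could be approached, as a sketch: wait for the response whose URL matches the request of interest and parse its body as JSON. The helper name `captureJson` and the `urlFragment` parameter are assumptions, since the actual request URL is only visible in the screenshot; `page.waitForResponse` needs a Puppeteer version that provides it.

```javascript
// Sketch: navigate and return the JSON body of one matching response.
// `page` is a Puppeteer Page; `urlFragment` identifies the request of interest.
async function captureJson(page, url, urlFragment) {
  // Start waiting before navigating so the response cannot be missed.
  const [response] = await Promise.all([
    page.waitForResponse((res) => res.url().includes(urlFragment) && res.ok()),
    page.goto(url),
  ]);
  // Parse the body as JSON, like the preview in the DevTools network tab.
  return response.json();
}

// Usage (inside an async function; both strings are placeholders):
//   const data = await captureJson(page, 'https://example.test/', '/api/items');
```

If several requests match, a `page.on('response', ...)` listener that collects them all may be a better fit than waiting for the first one.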

Puppeteer - infinite scrolling situation

Deadly posted on 2019-12-01 14:10:49
I wanted to keep scrolling down until all the elements with a particular class name are loaded in a dynamic HTML environment. This is the code I used: while ((await page.$$('.xj7')).length < counter) { await page.evaluate(() => window.scrollBy(0, window.innerHeight)); } The problem is that after it loads all the elements, it doesn't stop scrolling. I don't know why that is, as it should exit the while loop. When I terminate the application, I get this error: (node:5708) UnhandledPromiseRejectionWarning: Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has
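A loop like the one above never exits if the number of matching elements stays below `counter`, for instance when the page has fewer items than expected. A hedged sketch of a variant that also stops when scrolling no longer loads anything new; the function name and the `maxIdleRounds` and `pauseMs` options are hypothetical:

```javascript
// Sketch: scroll until `target` elements exist or no new ones keep appearing.
// `page` is a Puppeteer Page.
async function scrollUntilLoaded(page, selector, target, options = {}) {
  const { maxIdleRounds = 5, pauseMs = 500 } = options;
  let previous = 0;
  let idleRounds = 0;
  while (idleRounds < maxIdleRounds) {
    const count = (await page.$$(selector)).length;
    if (count >= target) return count;
    // Reset the idle counter whenever the last scroll loaded something new.
    idleRounds = count > previous ? 0 : idleRounds + 1;
    previous = count;
    await page.evaluate(() => window.scrollBy(0, window.innerHeight));
    // Give lazy-loaded content a moment to arrive before counting again.
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
  // Gave up: report how many elements were actually found.
  return (await page.$$(selector)).length;
}

// Usage (inside an async function):
//   const found = await scrollUntilLoaded(page, '.xj7', counter);
```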

Puppeteer - infinite scrolling situation

[亡魂溺海] posted on 2019-12-01 12:45:45
Question: I wanted to keep scrolling down until all the elements with a particular class name are loaded in a dynamic HTML environment. This is the code I used: while ((await page.$$('.xj7')).length < counter) { await page.evaluate(() => window.scrollBy(0, window.innerHeight)); } The problem is that after it loads all the elements, it doesn't stop scrolling. I don't know why that is, as it should exit the while loop. When I terminate the application, I get this error: (node:5708)

How to pass required module object to puppeteer page.evaluate

ぃ、小莉子 posted on 2019-12-01 11:46:53
Puppeteer version: 1.0.0; Platform / OS version: Windows 10; Node.js version: 8.9.3. Here is my code: const puppeteer = require('puppeteer'); const varname = require('varname'); ... const page = await browser.newPage(); await page.goto(url); let generalInfo = await page.evaluate(() => { let elements = Array.from(document.querySelectorAll('#order-details > table > tbody > tr')); let res = {}; elements.map((tr) => { let split = tr.innerText.trim().split('\t'); res[varname.camelback(split[0])] = split[1]; // Here is: ... Error: Evaluation failed: ReferenceError: varname is not defined }); return res
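The ReferenceError arises because the function passed to page.evaluate() is serialized and run in the browser, where modules required in Node.js do not exist. One possible workaround, sketched below, is to return only plain data from the page and apply the module on the Node.js side. The function name `scrapeOrderDetails` and the `toKey` parameter are hypothetical; `varname.camelback` from the question would be passed in as `toKey`.

```javascript
// Sketch: keep DOM access in the browser and module calls in Node.js.
// `page` is a Puppeteer Page; `toKey` converts a label into an object key.
async function scrapeOrderDetails(page, toKey) {
  // Browser side: return plain, serializable [label, value] pairs only.
  const pairs = await page.evaluate(() => {
    const rows = Array.from(
      document.querySelectorAll('#order-details > table > tbody > tr'));
    return rows.map((tr) => tr.innerText.trim().split('\t'));
  });
  // Node side: required modules are in scope here, so toKey can use them.
  const res = {};
  for (const [label, value] of pairs) {
    res[toKey(label)] = value;
  }
  return res;
}

// Usage (inside an async function):
//   const varname = require('varname');
//   const generalInfo = await scrapeOrderDetails(page, varname.camelback);
```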