Puppeteer

nodejs puppeteer: Linux (CentOS) environment setup and a simple screenshot with puppeteer

两盒软妹~` submitted on 2019-11-30 00:55:59
1. Install the Node environment (skip step 1 if Node is already installed)
# Download
cd /usr/local/src
wget https://nodejs.org/dist/v10.15.3/node-v10.15.3-linux-x64.tar.xz
# Extract
tar -Jxf node-v10.15.3-linux-x64.tar.xz
# Move the folder to /usr/local/bin
mv node-v10.15.3-linux-x64 /usr/local/bin/node-v10.15.3-linux-x64
# Configure environment variables
vi /etc/profile
Above the line "export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE HISTCONTROL", add:
export NODE_HOME=/usr/local/bin/node-v10.15.3-linux-x64
export NODE_PATH=/usr/local/bin/node-v10.15.3-linux-x64/lib/node_modules
export PATH=$PATH:$NODE_HOME/bin:$NODE_PATH
# Reload /etc/profile so the configuration takes effect
source /etc
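The screenshot step named in the title is not visible in this excerpt. A minimal sketch of what such a step could look like follows. The helper name `takeScreenshot`, the `--no-sandbox` flags (often needed when Chrome runs as root on a server), and the example URL are assumptions, not taken from the original post.

```javascript
// Minimal sketch: take a full-page screenshot with Puppeteer.
// `puppeteer` is passed in as an argument so the helper does not
// load the library (or a browser) until it is actually called.
async function takeScreenshot(puppeteer, url, outputPath) {
  // Assumption: sandbox flags disabled because the process runs as root.
  const browser = await puppeteer.launch({
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    await page.screenshot({ path: outputPath, fullPage: true });
    return outputPath;
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
}

// Usage (requires `npm install puppeteer`):
// takeScreenshot(require('puppeteer'), 'https://example.com', 'example.png');
```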

Can we somehow rename the file that is being downloaded using puppeteer?

夙愿已清 submitted on 2019-11-29 16:44:42
I am downloading a file through puppeteer into my directory. I need to upload this file to an s3 bucket so I need to pick up the file name. But the problem is, this file name has a time stamp that changes every time so I can't keep a hard coded name. So is there a way around this to get a constant name every time (even if the old file is replaced), or how to rename the file being downloaded? I thought of using node's fs.rename() function but that would again require the current file name. I want a constant file name to hard code and then upload into the s3 bucket. await page._client.send('Page
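One possible workaround, sketched under the assumption that the download directory is used only for this purpose: once the download has finished, pick the most recently modified file in that directory and rename it to a fixed name. The helper name `renameNewestFile` is hypothetical; `.crdownload` is the suffix Chrome gives to downloads still in progress.

```javascript
const fs = require('fs');
const path = require('path');

// Rename the most recently modified file in `dir` to `fixedName`,
// replacing any older file that already has that name.
function renameNewestFile(dir, fixedName) {
  const candidates = fs.readdirSync(dir)
    // Skip the previous result and downloads that are still in progress.
    .filter((name) => name !== fixedName && !name.endsWith('.crdownload'))
    .map((name) => ({ name, stat: fs.statSync(path.join(dir, name)) }))
    .filter((entry) => entry.stat.isFile())
    .sort((a, b) => b.stat.mtimeMs - a.stat.mtimeMs);

  if (candidates.length === 0) {
    throw new Error(`no downloaded file found in ${dir}`);
  }
  const target = path.join(dir, fixedName);
  // renameSync overwrites an existing file at the target path.
  fs.renameSync(path.join(dir, candidates[0].name), target);
  return target;
}

// Usage: const file = renameNewestFile('./downloads', 'report.csv');
```

The returned path can then be passed to whatever upload step follows.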

How to run Puppeteer code in any web browser?

我只是一个虾纸丫 submitted on 2019-11-29 16:41:02
I'm trying to do some web scraping with Puppeteer and I need to get the scraped value into a website I'm building. I tried loading the Puppeteer file in the HTML file as if it were an ordinary JavaScript file, but I keep getting an error. However, it works well if I run it in a cmd window. Scraper.js: getPrice(); function getPrice() { const puppeteer = require('puppeteer'); void (async () => { try { const browser = await puppeteer.launch() const page = await browser.newPage() await page.goto('http://example.com') await page.setViewport({ width: 1920, height: 938 }) await page.waitForSelector('.m-hotel
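Puppeteer drives a browser from Node.js, so a script that calls `require('puppeteer')` cannot be loaded by a web page directly. One common arrangement, sketched here as an assumption rather than as the accepted answer, is to keep the scraper on the server and expose its result over HTTP. The names `createPriceServer` and the `/price` route are hypothetical, and `getPrice` stands for an async function that returns the scraped value.

```javascript
const http = require('http');

// Build an HTTP server that answers GET /price with whatever `getPrice` resolves to.
function createPriceServer(getPrice) {
  return http.createServer(async (req, res) => {
    if (req.method !== 'GET' || req.url !== '/price') {
      res.writeHead(404, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: 'not found' }));
      return;
    }
    try {
      // The Puppeteer work happens here, on the server side.
      const price = await getPrice();
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ price }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: String(err.message) }));
    }
  });
}

// Usage: createPriceServer(getPrice).listen(3000);
// In the page: fetch('/price').then((r) => r.json()).then(({ price }) => { /* show it */ });
```

If the page is served from a different origin than this server, the response would also need CORS headers.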

Wait for text to appear when using puppeteer

穿精又带淫゛_ submitted on 2019-11-29 13:27:26
I wonder if there's a similar way as in selenium to wait for text to appear for a particular element. I've tried something like this but it doesn't seem to wait: await page.waitForSelector('.count', {visible: true}); You can use waitForFunction. See https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagewaitforfunctionpagefunction-options-args Including @elena's solution for completeness of the answer: await page.waitForFunction('document.querySelector(".count").innerText.length == 7'); Apart from the method presented in the answer from nilobarp, there are two more ways to do
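The `waitForFunction` call quoted above can also be written with a function and arguments instead of a string, which avoids hard-coding the selector and also tolerates the element not existing yet. This is a sketch; the helper name `waitForTextLength` and the default timeout are assumptions.

```javascript
// Wait until the element matching `selector` has text of exactly `length` characters.
// `page` is a Puppeteer Page; `selector` and `length` are forwarded into the browser context.
function waitForTextLength(page, selector, length, timeout = 30000) {
  return page.waitForFunction(
    (sel, len) => {
      // This callback runs inside the page, not in Node.js.
      const el = document.querySelector(sel);
      return el !== null && el.innerText.length === len;
    },
    { timeout },
    selector,
    length
  );
}

// Usage: await waitForTextLength(page, '.count', 7);
```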

Realtime scrap a chat using Nodejs

狂风中的少年 submitted on 2019-11-29 11:45:19
I want to build a scraping application in Node.js that monitors a chat in real time and stores certain messages in a database. Specifically, I want to capture data from the chats of streaming platforms, and so collect useful information for the people doing the streaming. But I don't know how to start doing this with Node.js. So far I have been able to capture the message data, but I cannot monitor new messages in real time. Any help in this regard? What I did so far: server.js var

Puppeteer: Get inner HTML

丶灬走出姿态 submitted on 2019-11-29 09:27:38
Does anybody know how to get the innerHTML or text of an element? Or, even better, how to click an element with a specific innerHTML? This is how it would work with normal javascript: var found = false; $(selector).each(function() { if (found) return; if ($(this).text().replace(/[^0-9]/g, '') === '5') { $(this).trigger('click'); found = true; } }); Thanks in advance for any help! This is how I get innerHTML: page.$eval(selector, (element) => { return element.innerHTML }) E. Fortes This should work with puppeteer:) const page = await browser.newPage(); const title = await page.evaluate(el => el
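For the second part of the question, clicking an element by its text, one option is `page.$$eval`, which runs a callback in the page over all elements matching a selector. The sketch below mirrors the digit-matching logic of the jQuery snippet in the question; the helper name `clickByDigits` is hypothetical.

```javascript
// Click the first element matching `selector` whose text, reduced to digits, equals `digits`.
// Resolves to true if an element was clicked, false otherwise.
function clickByDigits(page, selector, digits) {
  return page.$$eval(
    selector,
    (elements, wanted) => {
      // Runs in the browser: `elements` is an array of the matched DOM nodes.
      const match = elements.find(
        (el) => el.textContent.replace(/[^0-9]/g, '') === wanted
      );
      if (!match) return false;
      match.click();
      return true;
    },
    digits
  );
}

// Usage: const clicked = await clickByDigits(page, '.rating button', '5');
```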

Error while executing chrome without headless on heroku

瘦欲@ submitted on 2019-11-29 08:56:31
I am currently working on a project where I need to build an application that opens a URL in a browser in order to use some functions on it. For that I used puppeteer inside a nodejs script to open the browser on the server side so I can use it like an API. Here's the code (nodejs): app.get('/do', (req, res) => { console.log("ok"); (async() => { var browser = await puppeteer.launch( { args: ['--no-sandbox','--disable-setuid-sandbox'], headless: false }); var page = await browser.newPage(); await page.goto('https://url.com');//i hid the url for personal reason await page

Puppeteer: How to handle multiple tabs?

浪子不回头ぞ submitted on 2019-11-29 01:50:29
Question: Scenario: Web form for developer app registration with two part workflow. Page 1: Fill out developer app details and click on button to create Application ID, which opens, in a new tab... Page 2: The App ID page. I need to copy the App ID from this page, then close the tab and go back to Page 1 and fill in the App ID (saved from Page 2), then submit the form. I understand basic usage - how to open Page 1 and click the button which opens Page 2 - but how do I get a handle on Page 2 when it
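One way to get a handle on the tab that a click opens is to listen for the browser's `targetcreated` event before clicking, then ask the new target for its page. This is a sketch under the assumption that the click opens exactly one new tab; the helper name `clickAndGetNewTab` and the selectors in the usage comment are hypothetical.

```javascript
// Click `selector` on `page` and resolve to the Page object of the tab it opens.
async function clickAndGetNewTab(browser, page, selector) {
  // Start listening before clicking so the event cannot be missed.
  const newTarget = new Promise((resolve) => browser.once('targetcreated', resolve));
  await page.click(selector);
  const target = await newTarget;
  return target.page();
}

// Usage, following the two-page workflow described above:
// const page2 = await clickAndGetNewTab(browser, page1, '#create-app-id');
// const appId = await page2.$eval('#app-id', (el) => el.textContent.trim());
// await page2.close();
// await page1.bringToFront();
// await page1.type('#app-id-input', appId);
```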

how to manage log in session through headless chrome?

馋奶兔 submitted on 2019-11-28 22:29:33
Question: I need to make a scraper that will: open a headless browser, go to a URL, log in (there is Steam OAuth), fill some inputs, and click 2 buttons. The problem is that every new instance of the headless browser clears my login session, so I need to log in again and again. How can I save the session across instances, for example using puppeteer with headless chrome? Or how can I open an already logged-in headless chrome instance, if I am already logged in in my main chrome window? Answer 1: In Puppeteer you have access to the session cookies through
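A minimal sketch of carrying a session across launches, assuming the site keeps the login in cookies: write the page's cookies to a JSON file after logging in, and load them back before navigating in the next run. The helper names and the file name in the usage comment are hypothetical; `page.cookies()` and `page.setCookie()` are the Puppeteer calls used.

```javascript
const fs = require('fs');

// Write the current page's cookies to a JSON file; returns how many were saved.
async function saveCookies(page, file) {
  const cookies = await page.cookies();
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2));
  return cookies.length;
}

// Load cookies from the JSON file (if it exists) into the page; returns how many were set.
async function restoreCookies(page, file) {
  if (!fs.existsSync(file)) return 0;
  const cookies = JSON.parse(fs.readFileSync(file, 'utf8'));
  if (cookies.length > 0) await page.setCookie(...cookies);
  return cookies.length;
}

// Usage:
// await restoreCookies(page, 'cookies.json');  // before page.goto(...)
// ... log in if still required ...
// await saveCookies(page, 'cookies.json');     // after a successful login
```

Another option that may fit is the `userDataDir` launch option, which makes Chrome reuse a profile directory between runs.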

Running puppeteer with containerized chrome binary from another container

两盒软妹~` submitted on 2019-11-28 14:24:26
I want my code that uses puppeteer to run in one container and use (perhaps via the "executablePath" launch param?) a chrome binary from another container. Is this possible? Is there any known solution for that? Use case: worker code runs in multiple k8s pods (as containers). "Sometimes" (it might be often or not) a worker needs to run code utilizing puppeteer. I don't want to make the docker image as gigantic and limited as the puppeteer/chrome container is (1.5 GB if I recall correctly). I just want my code to be supplied with the needed binary from another running container. Notice: this is not a question about
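Note that `executablePath` points at a binary on the local filesystem, so it would not reach into another container. A different technique, sketched here as an assumption and not as an answer from the thread, is to run Chrome in its own container with remote debugging enabled (for example via `--remote-debugging-port`) and attach to it over WebSocket with `puppeteer.connect`. The helper name `withRemoteBrowser` and the endpoint in the usage comment are hypothetical.

```javascript
// Attach to a Chrome instance that is already running elsewhere instead of launching one.
// `puppeteer` is the imported module (puppeteer or puppeteer-core).
async function withRemoteBrowser(puppeteer, browserWSEndpoint, task) {
  const browser = await puppeteer.connect({ browserWSEndpoint });
  try {
    return await task(browser);
  } finally {
    // disconnect() detaches from the remote browser without shutting it down,
    // so other workers can keep using it.
    await browser.disconnect();
  }
}

// Usage (hypothetical endpoint of the Chrome container):
// const title = await withRemoteBrowser(
//   require('puppeteer-core'),
//   'ws://chrome:9222/devtools/browser/<id>',
//   async (browser) => {
//     const page = await browser.newPage();
//     await page.goto('https://example.com');
//     const t = await page.title();
//     await page.close();
//     return t;
//   }
// );
```

With this arrangement the worker image only needs `puppeteer-core`, which does not download a browser.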