Puppeteer

Puppeteer returns undefined for page.evaluate despite being resolved on devtools

泪湿孤枕 submitted on 2021-01-05 09:04:29
Question: I have some simple code that I expect to find the element containing certain text: await page.goto('https://www.reddit.com/r/koreanvariety/comments/hsdt4j/the_great_escape_season_3_e12_back_to_the/') await page.waitFor(2000); const findComment = await page.evaluate(() => { return Array.from(document.querySelectorAll('a')).find(el => el.textContent === 'sometext' ) }) console.log('findComment', findComment) Although the code above works in DevTools, it returns undefined in my

Multi browsers vs multi tabs in Puppeteer

萝らか妹 submitted on 2021-01-01 04:56:06
Question: I have 100 web pages that I have to test for runtime errors. I found the Puppeteer plugin that can do that "no sweat", but I ran into a dilemma: launch one browser with multiple tabs, or a new browser for each link? What is the best approach in this case? With multiple tabs, I have heard that CSS animations and something else (I can't remember what) may not work when the tab is not in focus. Multiple browsers obviously cause a higher CPU load (don't they?). Answer 1: These are the

How do I sign into Google using Puppeteer?

非 Y 不嫁゛ submitted on 2020-12-31 14:53:30
Question: I am using Puppeteer and I am trying to sign in to my Gmail account. URL: https://accounts.google.com/ServiceLogin/identifier?service=mail&passive=true&rm=false&continue=https%3A%2F%2Fmail.google.com%2Fmail%2F&ss=1&scc=1&ltmpl=default&ltmplcache=2&emr=1&osid=1&flowName=GlifWebSignIn&flowEntry=AddSession Currently my code types into the email form and presses Enter; then, when the page goes to the password screen, there is no way to type into the password input. This may be because it is
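If the password field simply is not ready when the script tries to type, waiting for it to become visible first may help. This is a minimal sketch of that timing fix only. The helper names and both selectors (`input[type="email"]`, `input[type="password"]`) are assumptions, not taken from the question, and the real form may need different ones.

```javascript
// Wait until `selector` is visible on the page, then type into it.
async function typeWhenVisible(page, selector, text, timeout = 10000) {
  await page.waitForSelector(selector, { visible: true, timeout });
  await page.type(selector, text, { delay: 50 });
}

// Two-step sign-in flow: email first, then the password screen.
// `page` is a Puppeteer Page created by the caller.
async function signIn(page, loginUrl, email, password) {
  await page.goto(loginUrl, { waitUntil: 'networkidle2' });
  await typeWhenVisible(page, 'input[type="email"]', email); // assumed selector
  await page.keyboard.press('Enter');
  await typeWhenVisible(page, 'input[type="password"]', password); // assumed selector
  await page.keyboard.press('Enter');
}
```

The `visible: true` option matters here because a field can exist in the DOM while it is still hidden during the transition between the two screens.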

How to get body / json response from XHR request with Puppeteer

女生的网名这么多〃 submitted on 2020-12-30 05:14:14
Question: I want to get the JSON data from a website I'm scraping with Puppeteer, but I can't figure out how to get the body of the request back. Here's what I've tried: const puppeteer = require('puppeteer') const results = []; (async () => { const browser = await puppeteer.launch({ headless: false }) const page = await browser.newPage() await page.goto("https://capuk.org/i-want-help/courses/cap-money-course/introduction", { waitUntil: 'networkidle2' }); await page.type('#search-form > input[type="text"]'

How to get all links from the DOM?

时光怂恿深爱的人放手 submitted on 2020-12-29 06:51:22
Question: According to https://github.com/GoogleChrome/puppeteer/issues/628, I should be able to get all links from <a href="xyz"> elements with this single line: const hrefs = await page.$$eval('a', a => a.href); But when I try a simple console.log(hrefs), I only get http://example.de/index.html as output, which suggests that it found only 1 link. But the page definitely has 12 links in the source code / DOM. Why does it fail to find them all? Minimal example: 'use strict'; const puppeteer = require(

integrate puppeteer in gitlab with gitlab-ci.yml

こ雲淡風輕ζ submitted on 2020-12-29 05:32:43
Question: I'm currently working on e2e tests with Chrome Puppeteer. I am at the stage where it would be ideal to integrate my tests into the development process. What I want to accomplish is the following: my tests run automatically before every deploy to production. If they succeed, the deployment goes through; if they fail, the deployment is canceled. I use a pipeline on GitLab to automate my deployment process. So my main question is: how can I integrate my Puppeteer tests into the gitlab-ci.yml file? Answer 1: This might be

How can I download images on a page using puppeteer?

狂风中的少年 submitted on 2020-12-29 03:00:10
Question: I'm new to web scraping and want to download all images on a webpage using Puppeteer: const puppeteer = require('puppeteer'); let scrape = async () => { // Actual Scraping goes Here... const browser = await puppeteer.launch({headless: false}); const page = await browser.newPage(); await page.goto('https://memeculture69.tumblr.com/'); // Right click and save images }; scrape().then((value) => { console.log(value); // Success! }); I have looked at the API docs but could not figure out how to