Puppeteer

Cloud functions timeout on page.goto()

陌路散爱 submitted on 2019-12-10 11:19:09

Question: I run tests with Puppeteer in Cloud Functions. If I run the tests on my local machine everything is fine, and they also pass in the Cloud Functions emulator. But when I deploy the function to the cloud, every test gets stuck on page.goto('https://...') and the function fails with a timeout, which in my case is 3 minutes.

Answer 1: The problem was in Puppeteer itself. I downgraded from version 1.13.0 to 1.11.0 and now everything works fine. See the discussion here.

Source: https://stackoverflow.com/questions/55274130/cloud
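The fix amounts to pinning the known-good version so a later `npm install` does not pull the broken one back in. A minimal package.json fragment (the version number is the one from the answer):

```json
{
  "dependencies": {
    "puppeteer": "1.11.0"
  }
}
```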

puppeteer wait for page/DOM updates - respond to new items that are added after initial loading

有些话、适合烂在心里 submitted on 2019-12-10 10:49:57

Question: I want to use Puppeteer to respond to page updates. The page shows items, and when I leave the page open, new items can appear over time, e.g. a new item every 10 seconds. I can use the following to wait for an item on the initial load of the page:

await page.waitFor(".item");
console.log("the initial items have been loaded");

How can I wait for / catch future items? I would like to achieve something like this (pseudo code):

await page.goto('http://mysite');
await page.waitFor(".item");
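One way to catch items added after the initial load is a MutationObserver wired back to Node through page.exposeFunction. This is a sketch, not code from the question: it assumes an already-open Puppeteer page, and the callback name onNewItem is ours.

```javascript
// Invoke a Node-side callback whenever a new node matching `selector`
// (".item" in the question) is added to the page.
async function watchForNewItems(page, selector, onNewItem) {
  // Make the Node callback callable from the browser context.
  await page.exposeFunction('onNewItem', onNewItem);
  // Install a MutationObserver that reports every matching added node.
  await page.evaluate((sel) => {
    const observer = new MutationObserver((mutations) => {
      for (const m of mutations) {
        for (const node of m.addedNodes) {
          if (node.nodeType === 1 && node.matches(sel)) {
            window.onNewItem(node.textContent);
          }
        }
      }
    });
    observer.observe(document.body, { childList: true, subtree: true });
  }, selector);
}
```

After calling watchForNewItems(page, '.item', handler), the handler fires once per new item for as long as the page stays open.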

Puppeteer - counting elements by class name

我是研究僧i submitted on 2019-12-10 09:49:14

Question: I am trying to get info about all the elements with a particular class name into an array. The problem is that this is a dynamically generated HTML page, and as I scroll down, new elements with that class name appear. Fortunately, I know beforehand how many of these elements exist. So my hypothetical solution is to check the number of elements with that particular class name and, as long as that number is less than the one I know, keep scrolling down. The problem is I don't know exactly how
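The scroll-until-count idea described above can be sketched like this. Assumptions: `page` is an open Puppeteer page, `expectedCount` is the total known in advance, the selector is a placeholder, and the delay uses the 2019-era page.waitFor(ms) API.

```javascript
// Keep scrolling until the page has rendered the expected number of elements,
// then return the final count.
async function scrollUntilCount(page, selector, expectedCount) {
  let count = await page.$$eval(selector, (els) => els.length);
  while (count < expectedCount) {
    // Jump to the bottom so the infinite-scroll handler loads more items.
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    await page.waitFor(1000); // give the new items time to render
    count = await page.$$eval(selector, (els) => els.length);
  }
  return count;
}
```

Once the count is reached, a single page.$$eval over the same selector can collect the info into an array.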

Puppeteer | Wait for all JavaScript is executed

懵懂的女人 submitted on 2019-12-10 03:26:19

Question: I am trying to take screenshots of multiple pages, which should be fully loaded (including lazy-loaded images) for later comparison. I found the lazyimages_without_scroll_events.js example, which helps a lot. With the following code the screenshots look fine, but there is one major issue:

async function takeScreenshot(browser, viewport, route) {
  return browser.newPage().then(async (page) => {
    const fileName = `${viewport.directory}/${getFilename(route)}`;
    await page.setViewport({ width:
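A common pattern for "everything has loaded" screenshots combines networkidle0 with an explicit wait on every image element. This is a sketch under those assumptions, not the code from lazyimages_without_scroll_events.js; `page`, `url`, and `path` are illustrative parameters.

```javascript
// Screenshot only after the network is quiet and every <img> has finished.
async function screenshotWhenLoaded(page, url, path) {
  // networkidle0: no in-flight requests for 500 ms.
  await page.goto(url, { waitUntil: 'networkidle0' });
  // Belt and braces: also wait until each <img> reports complete.
  await page.evaluate(() =>
    Promise.all(
      Array.from(document.images)
        .filter((img) => !img.complete)
        .map((img) => new Promise((resolve) => {
          img.addEventListener('load', resolve);
          img.addEventListener('error', resolve);
        }))
    )
  );
  await page.screenshot({ path, fullPage: true });
}
```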

Serverless in Practice: Quickly Build a Distributed Puppeteer Web Page Screenshot Service

此生再无相见时 submitted on 2019-12-09 17:51:20

In plain terms: Puppeteer can run Chrome or Chromium headlessly (it can also run on a server with a display) and then drive the browser's behavior from code. Even in headless mode, Chrome or Chromium still renders page content correctly in memory. So what can Puppeteer do? It is useful in many situations, for example:

- generating page screenshots or PDFs
- crawling SPAs (Single-Page Applications) and performing server-side rendering (SSR)
- advanced crawling of pages whose content is largely rendered asynchronously
- simulating keyboard input, automatic form submission, page logins and so on, for UI automation testing
- capturing a site's timeline to trace your website and help diagnose performance problems

This article uses the screenshot scenario as the demo.

How do you quickly deploy a distributed Puppeteer web application? To deploy one quickly, we choose Function Compute.

Function Compute is an event-driven service. With Function Compute you do not manage servers or their operation; you only write and upload code. Function Compute provisions the compute resources, runs your code with elastic scaling, and bills you only for the resources your code actually consumes while running. For more on Function Compute, see the reference.

With Function Compute in place, our goal here is to build a distributed application, but what we actually do is simple: write the business code and deploy it to Function Compute
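The screenshot scenario the article demos can be sketched as a small handler. Everything here is illustrative (the function name, the networkidle2 choice), and it assumes a Puppeteer browser launched once and reused across invocations, which is the usual pattern in a long-lived function instance.

```javascript
// Take a full-page PNG of `url` using a shared Puppeteer browser instance,
// returning the image as a Buffer the function runtime can ship back.
async function screenshotHandler(browser, url) {
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await page.screenshot({ type: 'png', fullPage: true });
  } finally {
    await page.close(); // never leak pages in a long-lived function instance
  }
}
```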

how do POST request in puppeteer?

白昼怎懂夜的黑 submitted on 2019-12-09 09:51:37

Question:

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://www.example.com/search');
  const data = await page.content();
  browser.close();
  res.send(data);
})();

I use this code to send a GET request. I don't understand how I should send a POST request.

Answer 1: Getting the "order" right can be a bit of a challenge. The documentation doesn't have that many examples... there are some juicy items in the example folder of the repository that you
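One common technique (a sketch, not necessarily what the truncated answer goes on to describe) is request interception: rewrite the navigation request into a POST before it leaves the browser. The Content-Type below assumes a form-encoded body.

```javascript
// Navigate to `url`, but send the navigation request as a POST carrying
// `postData` (e.g. "a=1&b=2").
async function gotoWithPost(page, url, postData) {
  await page.setRequestInterception(true);
  page.once('request', (request) => {
    request.continue({
      method: 'POST',
      postData,
      headers: {
        ...request.headers(),
        'Content-Type': 'application/x-www-form-urlencoded',
      },
    });
  });
  return page.goto(url);
}
```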

Puppeteer-web: Protocol error (Target: getBrowserContexts) not allowed

给你一囗甜甜゛ submitted on 2019-12-08 14:35:27

I've got a Chrome extension that is trying to use puppeteer-web. I followed the code from "Puppeteer is not a constructor" to try to set up puppeteer-web. This is my code:

const puppeteer = require("puppeteer");

async function initiatePuppeteer() {
  let browserWSEndpoint = '';
  await fetch("http://127.0.0.1:9222/json")
    .then(response => response.json())
    .then(function (data) {
      let filteredData = data.filter(tab => tab.type === 'page');
      browserWSEndpoint = filteredData[0].webSocketDebuggerUrl;
    })
    .catch(error => console.log(error));

  const browser = await puppeteer.connect({
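A likely cause, offered as a hedged diagnosis: Target.getBrowserContexts is a browser-level DevTools command, and the entries in /json are per-page endpoints, so connecting through a page's webSocketDebuggerUrl is rejected. The sketch below asks /json/version for the browser-wide endpoint instead; the function name and the injected fetch parameter are ours.

```javascript
// Return the browser-level WebSocket endpoint for puppeteer.connect().
async function getBrowserEndpoint(fetchImpl) {
  // /json/version carries the browser-wide webSocketDebuggerUrl; the
  // per-page endpoints listed in /json cannot run browser-level commands
  // such as Target.getBrowserContexts.
  const res = await fetchImpl('http://127.0.0.1:9222/json/version');
  const data = await res.json();
  return data.webSocketDebuggerUrl;
}
```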

Puppeteer Crawler - Error: net::ERR_TUNNEL_CONNECTION_FAILED

痴心易碎 submitted on 2019-12-08 13:50:30

Question: Currently I have Puppeteer running with a proxy on Heroku. Locally the proxy relay works fine; on Heroku, however, I get Error: net::ERR_TUNNEL_CONNECTION_FAILED. I've set all the .env info in the Heroku config vars, so it is all available. Any idea how I can fix this error? I currently have:

const browser = await puppeteer.launch({
  args: [
    "--proxy-server=https=myproxy:myproxyport",
    "--no-sandbox",
    '--disable-gpu',
    "--disable-setuid-sandbox",
  ],
  timeout: 0,
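ERR_TUNNEL_CONNECTION_FAILED often means the proxy refused the CONNECT tunnel, and a frequent reason is that credentials were never supplied: Chromium cannot take proxy credentials on the command line. A hedged sketch of answering the proxy's auth challenge per page; the username and password would come from the Heroku config vars mentioned above.

```javascript
// Open a page that authenticates against the --proxy-server proxy.
async function newPageWithProxyAuth(browser, username, password) {
  const page = await browser.newPage();
  // Puppeteer answers the proxy's 407 challenge via page.authenticate().
  await page.authenticate({ username, password });
  return page;
}
```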

Possible to get HTTP response headers with nodejs and puppeteer?

こ雲淡風輕ζ submitted on 2019-12-08 13:42:03

Question: Hi there, is there any possible way to get server information like the above using Node.js and Puppeteer? Many thanks!

Answer 1: These are the response headers, which you can get with response.headers():

const response = await page.goto(url);
const headers = response.headers();
console.log(headers);

Source: https://stackoverflow.com/questions/54935656/possible-to-get-http-response-headers-with-nodejs-and-puppeteer

Error: failed to find element matching selector for <img data-src="url>

≡放荡痞女 submitted on 2019-12-08 13:35:16

Question: Running on Puppeteer, everything up to date. The intended process is to go to the website, where the URL is url/{search item}, and run through the list of search names. Then for each search item and search page, get the name, price, and image URL of each listing. Now there is an error that it cannot find the selector. I appreciate any help on this, many thanks! The layout of the website's data is as follows:

<div class="items-box-content">
  <section class="items-box">
    <a href="https://listingurl">
      <figure class="items-box-photo">
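Two things usually cause this combination: the selector runs before the listings have rendered, and lazy-loaded images keep their URL in data-src rather than src (as the title's <img data-src="url"> suggests). A sketch combining both fixes; the .items-box selector is taken from the HTML fragment above, while the name and price selectors are cut off in the question and are therefore not guessed here.

```javascript
// Wait for the listings to render, then read each listing's image URL.
async function scrapeListings(page) {
  // Avoid "failed to find element matching selector" on slow renders.
  await page.waitForSelector('.items-box');
  return page.$$eval('.items-box', (boxes) =>
    boxes.map((box) => ({
      // data-src holds the real URL for lazily loaded images.
      img: box.querySelector('img') &&
           box.querySelector('img').getAttribute('data-src'),
    }))
  );
}
```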