Puppeteer

Puppeteer with Brave browser?

无人久伴 submitted on 2020-08-26 09:33:32
Question: I'm wondering whether it's possible to run a Puppeteer script with the Brave browser instead of the stock Chromium build. I know that Brave is built on Chromium, which is why you can launch a Selenium script with Brave, but do you know if the same is possible with Puppeteer? Answer 1: Yes, you can use Brave. The only catch is that ad blocking doesn't work in headless mode. For ad blocking in headful mode, you need to create a profile and point the userDataDir option to
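To illustrate the answer above, here is a minimal sketch of launching Brave in headful mode with a persistent profile. It is not the answerer's code: the Brave path, the profile directory, the helper name braveLaunchOptions, and the choice of puppeteer-core (which does not download its own Chromium) are all assumptions for illustration. Only executablePath, userDataDir, and headless are used, which are standard launch options.

```javascript
// Sketch: launch Brave through Puppeteer in headful mode with a persistent profile.
// Both paths are hypothetical; adjust them for your system.
const BRAVE_PATH = '/usr/bin/brave-browser';        // assumed install location
const PROFILE_DIR = '/tmp/brave-puppeteer-profile'; // assumed profile directory

// Build the options object passed to puppeteer.launch().
function braveLaunchOptions(executablePath, userDataDir) {
  return {
    executablePath,  // run Brave instead of the bundled Chromium
    userDataDir,     // profile directory reused across runs
    headless: false, // the answer reports that ad blocking does not work headless
  };
}

async function main() {
  // Required lazily so the helper above works without Puppeteer installed.
  const puppeteer = require('puppeteer-core');
  const browser = await puppeteer.launch(braveLaunchOptions(BRAVE_PATH, PROFILE_DIR));
  const page = await browser.newPage();
  await page.goto('https://example.com');
  console.log(await page.title());
  await browser.close();
}

// Uncomment to run against a locally installed Brave:
// main().catch(console.error);
```

Keeping the option building in a separate function makes it easy to swap the paths per operating system without touching the launch code.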

How to “hook in” puppeteer into a running Chrome instance/tab

天涯浪子 submitted on 2020-08-26 06:17:06
Question: Is it possible to attach Puppeteer to a running Chrome instance (a manually started browser) and then take over control of a tab? I'm assuming it is related to starting Chrome with the --no-sandbox flag, but I don't know how to continue from there. Thanks for any help. Answer 1: You can use puppeteer.connect(options) (see here): const puppeteer = require('puppeteer'); const browserWSEndpoint = 'a browser websocket endpoint to connect to'; const browser = await
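As a rough sketch of how puppeteer.connect can be used for this, the example below assumes Chrome was started by hand with --remote-debugging-port=9222 and uses the browserURL option so that Puppeteer looks up the WebSocket endpoint itself. The port, the URL fragment used to pick a tab, and the helper findPage are assumptions for illustration, not part of the original answer.

```javascript
// Sketch: attach Puppeteer to a Chrome that was started manually with
//   chrome --remote-debugging-port=9222
// The port and the URL fragment used to choose a tab are assumptions.
const DEBUG_URL = 'http://127.0.0.1:9222';

// Return the first open page whose URL contains `fragment`, or undefined.
function findPage(pages, fragment) {
  return pages.find((page) => page.url().includes(fragment));
}

async function main() {
  // Required lazily so findPage can be used without Puppeteer installed.
  const puppeteer = require('puppeteer-core');
  // browserURL lets Puppeteer discover the WebSocket endpoint on its own.
  const browser = await puppeteer.connect({ browserURL: DEBUG_URL });
  const page = findPage(await browser.pages(), 'example.com');
  if (page) {
    console.log('Attached to tab:', await page.title());
  }
  // disconnect() detaches Puppeteer but leaves Chrome running.
  await browser.disconnect();
}

// Uncomment once Chrome is running with remote debugging enabled:
// main().catch(console.error);
```

Calling disconnect() rather than close() matters here, because the browser belongs to the user and should stay open after the script finishes.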

Getting the sibling of an elementHandle in Puppeteer

你说的曾经没有我的故事 submitted on 2020-08-24 12:02:00
Question: I'm doing const last = await page.$('.item:last-child'). Now I'd love to get the preceding element based on last, i.e. something like const prev = last.$.prev(). Any thoughts on how to do this? Thanks! Answer 1: You should use previousElementSibling inside evaluateHandle, like this: const prev = await page.evaluateHandle(el => el.previousElementSibling, last); Here is a full example: const puppeteer = require('puppeteer'); const html = ` <html> <head></head> <body> <div> <div class="item">item 1</div> <div class="item

How to use puppeteer-core with electron?

左心房为你撑大大i submitted on 2020-08-24 03:12:18
Question: I got this code from another Stack Overflow question: import electron from "electron"; import puppeteer from "puppeteer-core"; const delay = (ms: number) => new Promise(resolve => { setTimeout(() => { resolve(); }, ms); }); (async () => { try { const app = await puppeteer.launch({ executablePath: electron, args: ["."], headless: false, }); const pages = await app.pages(); const [page] = pages; await page.setViewport({ width: 1200, height: 700 }); await delay(5000); const image = await page

Puppeteer: How do I download a file using chrome headless browser api?

旧街凉风 submitted on 2020-08-23 03:59:16
Question: Using Puppeteer, how do I get the headless Chrome browser to download a file (or make additional HTTP requests and save the response)? Answer 1: You could make a simple request through the window; it should work (see npm request). As soon as it returns the promise with your response, you could write an Express save function and store the response. It seems that Puppeteer has this implementation. See here: How to make a request with puppeteer. Have a look at this: Emitted when a page issues a
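One way to "save the response" is to listen for the page's response event and write the body to disk. The sketch below takes that route rather than the request-based approach the answer describes. The helper names, the target directory, and the extension filter are assumptions for illustration. response.buffer() can fail for some responses (redirects, for example), so errors are logged instead of thrown.

```javascript
const fs = require('fs');
const path = require('path');

// Derive a local file name from a response URL (falls back to "download.bin").
function fileNameFromUrl(url) {
  const base = path.posix.basename(new URL(url).pathname);
  return base || 'download.bin';
}

// Save every response whose URL path ends with `extension` into `dir`.
// Returns an array that fills with one promise per file being written.
function saveMatchingResponses(page, extension, dir) {
  const pending = [];
  page.on('response', (response) => {
    const url = response.url();
    if (!new URL(url).pathname.endsWith(extension)) return;
    pending.push(
      response.buffer() // raw bytes of the response body
        .then((body) => fs.promises.writeFile(path.join(dir, fileNameFromUrl(url)), body))
        .catch((err) => console.error('could not save', url, err.message))
    );
  });
  return pending;
}

async function main() {
  // Required lazily so the helpers above work without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  const pending = saveMatchingResponses(page, '.png', process.cwd());
  await page.goto('https://example.com', { waitUntil: 'networkidle0' });
  await Promise.all(pending); // finish writing before the browser closes
  await browser.close();
}

// Uncomment to run against a real browser:
// main().catch(console.error);
```

Collecting the write promises lets the caller wait for all files before closing the browser, which avoids truncated downloads.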

Building a fairly interesting crawler platform with Apify + node + react/vue

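ε祈祈猫儿з submitted on 2020-08-16 19:35:55
Preface: Friends who know me may be aware that I never write about trending topics. Why not? Is it because I don't follow them? Not really. There are events I follow closely, and I do have plenty of thoughts and opinions about them. But I have always held to one principle: create content with staying power.
The material in this article comes from a crawler management platform I was previously responsible for developing. I have extracted a relatively self-contained functional module to explain how to build your own crawler platform with nodejs. The article covers quite a few topics, including nodejs, crawler frameworks, parent and child processes and their communication, react, and umi, and I will introduce each of them in language that is as simple as possible.
What you will learn: an introduction to the Apify framework and its basic usage; how to create parent and child processes and communicate between them; manually limiting the crawler's maximum concurrency in javascript; an approach to capturing an image of an entire web page; using nodejs third-party libraries and modules; building the crawler front-end UI with umi3 + antd4.0.
Platform preview: The figure above shows the crawler platform we are going to build. We can enter a URL to crawl the data on that site and generate a snapshot of the whole page. Once crawling has finished, we can download the data and images. The right side of the page lists the user's crawl history, which is convenient for reuse or backup.
Main text: Before getting started, it is worth understanding some of the applications of crawlers. The crawlers we usually encounter are mostly used to scrape page data, capture request information, take page screenshots, and so on, as shown below: Of course, the applications of crawlers go far beyond this; we can also use crawler libraries for automated testing, server-side rendering,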
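The excerpt mentions manually limiting the crawler's maximum concurrency in JavaScript, but the article's own implementation is not shown here. The following is only one common way to do it, sketched under that caveat: a fixed number of "lanes" pull items from a shared index until the list is exhausted. The function name runWithLimit and its signature are hypothetical.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function runWithLimit(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each lane claims one item at a time until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start up to `limit` lanes and wait for all of them to drain the list.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}
```

In a crawler, worker would typically open a page, visit one URL, and return the scraped data, for example runWithLimit(urls, 3, crawlOne), where crawlOne is whatever per-URL function the platform defines. If any worker rejects, the returned promise rejects as well, so a production version would probably catch errors per item.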

Web automated testing with CukeTest + Puppeteer

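那年仲夏 submitted on 2020-08-16 07:34:07
Using the Baidu homepage as the test page, we write a functional test demo with CukeTest + Puppeteer to put the concepts from the previous article into practice. CukeTest official documentation: http://www.cuketest.com/zh-cn/ Puppeteer official documentation: https://zhaoqize.github.io/puppeteer-api-zh_CN/
I. Example 1. Functional test: open multiple web pages in parameterized form. 1. Open CukeTest and create a new empty project, then install Node and Puppeteer; mind the version compatibility between the two, as mentioned in the previous article. 2. Edit the parameters of the feature script. 3. Write the script that corresponds to the feature. 4. Run it, as shown in the figure below. The text of the feature script is as follows:
    # language: zh-CN
    功能: 百度首页
      打开百度首页
      @openPage
      场景大纲: 页面打开
        假如打开百度首页 "<param1>"
        @pageOne
        例子:
          | param1 |
          | https://www.baidu.com/ |
          | https://www.runoob.com/ |
        @pageTwo
        例子:
          | param1 |
          | https://www.csdn.net/ |
          | https://www.cnblogs.com/ |
      @baiduSearch
      场景: 百度首页搜索
        打开百度首页,搜索 'puppeteer',百度查询并截图保存结果
        假如打开百度首页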