web-scraping

Unable to pass cookies between selenium and requests in order to do the scraping using the latter

这一生的挚爱 submitted on 2020-12-07 06:37:30
Question: I've written a script in Python using Selenium to log into a site and then transfer the cookies from the driver to requests, so that I can carry out further activity with requests. I used the line item = soup.select_one("div[class^='gravatar-wrapper-']").get("title") to check whether the script can fetch my username once everything is done. This is my attempt so far: import requests from bs4 import BeautifulSoup from selenium import webdriver from selenium.webdriver.common.keys
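The cookie hand-off described in this question can be sketched roughly as follows. This is a minimal illustration and not the asker's code: the helper name selenium_cookies_to_dict and the sample cookie values are hypothetical, and it assumes cookie records shaped like the list of dicts that Selenium's driver.get_cookies() returns (each with at least "name" and "value" keys).

```python
from typing import Dict, Iterable, Mapping


def selenium_cookies_to_dict(cookies: Iterable[Mapping[str, object]]) -> Dict[str, str]:
    """Flatten Selenium-style cookie records into a name -> value mapping.

    Hypothetical helper: only the "name" and "value" keys are used, so
    domain, path and expiry information is deliberately dropped.
    """
    return {str(cookie["name"]): str(cookie["value"]) for cookie in cookies}


# Made-up sample shaped like the output of driver.get_cookies()
sample = [
    {"name": "acct", "value": "t=abc", "domain": ".example.com", "path": "/"},
    {"name": "prov", "value": "123", "domain": ".example.com", "path": "/"},
]
cookie_dict = selenium_cookies_to_dict(sample)

# With a live, logged-in browser session the mapping would be used roughly as:
#   session = requests.Session()
#   session.cookies.update(selenium_cookies_to_dict(driver.get_cookies()))
#   response = session.get(url)  # url is whatever page needs the login
```

Dropping the domain and path keeps the sketch short, but it means every cookie is sent to whatever host the session contacts. If the site also checks request headers such as User-Agent, those may need to match the browser's as well.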

How do I get the html (with js script) of a page using JSOUP

给你一囗甜甜゛ submitted on 2020-12-06 15:07:08
Question: I want to get the HTML content of a page but am unable to because of the scripts in the HTML file. I'm trying to use Jsoup to extract the content. If it helps, this is the link to my issue: JSoup select form returns null. Does anyone know how I can achieve this? Thanks. Source: https://stackoverflow.com/questions/64971866/how-do-i-get-the-html-with-js-script-of-a-page-using-jsoup

How do I get the html (with js script) of a page using JSOUP

只谈情不闲聊 submitted on 2020-12-06 15:06:58
Question: I want to get the HTML content of a page but am unable to because of the scripts in the HTML file. I'm trying to use Jsoup to extract the content. If it helps, this is the link to my issue: JSoup select form returns null. Does anyone know how I can achieve this? Thanks. Source: https://stackoverflow.com/questions/64971866/how-do-i-get-the-html-with-js-script-of-a-page-using-jsoup

Unable to make my script work asynchronously

孤街浪徒 submitted on 2020-12-01 09:19:51
Question: I've written a script in VBA to scrape movie names and their genres from a torrent site. Although the name and genre are present on its landing page, I created the script to parse them one layer deep (from their main pages). To be clearer, this is one such page, which is what I mean by main page. My script parses them flawlessly. However, my intention is to do the same using asynchronous requests. Currently the script does its job synchronously (in a blocking manner).

Unable to make my script work asynchronously

别来无恙 submitted on 2020-12-01 09:15:35
Question: I've written a script in VBA to scrape movie names and their genres from a torrent site. Although the name and genre are present on its landing page, I created the script to parse them one layer deep (from their main pages). To be clearer, this is one such page, which is what I mean by main page. My script parses them flawlessly. However, my intention is to do the same using asynchronous requests. Currently the script does its job synchronously (in a blocking manner).

How to scrape a Tableau dashboard in which data is only displayed in a plot after clicking in a map?

放肆的年华 submitted on 2020-11-29 14:01:51
Question: I am trying to scrape data from this public Tableau dashboard. The interest is in the plotted time-series data. If I click on a specific state in the map, the time series changes to that specific state. Following this and this post, I got the results for the time series aggregated at the country level (with the code provided below). But my interest is in state-level data. import requests from bs4 import BeautifulSoup import json import re # get the second tableau link r = requests.get( f

How to scrape a Tableau dashboard in which data is only displayed in a plot after clicking in a map?

一个人想着一个人 submitted on 2020-11-29 14:01:47
Question: I am trying to scrape data from this public Tableau dashboard. The interest is in the plotted time-series data. If I click on a specific state in the map, the time series changes to that specific state. Following this and this post, I got the results for the time series aggregated at the country level (with the code provided below). But my interest is in state-level data. import requests from bs4 import BeautifulSoup import json import re # get the second tableau link r = requests.get( f

How to scrape a Tableau dashboard in which data is only displayed in a plot after clicking in a map?

安稳与你 submitted on 2020-11-29 13:58:30
Question: I am trying to scrape data from this public Tableau dashboard. The interest is in the plotted time-series data. If I click on a specific state in the map, the time series changes to that specific state. Following this and this post, I got the results for the time series aggregated at the country level (with the code provided below). But my interest is in state-level data. import requests from bs4 import BeautifulSoup import json import re # get the second tableau link r = requests.get( f

Using Apps Script to scrape javascript rendered web page

邮差的信 submitted on 2020-11-28 03:45:30
Question: I am struggling to put together a script to scrape a JavaScript-rendered web page through Apps Script. I found "How to scrape Javascript rendered websites using Javascript?" here, but I don't know how to put it together, such as how to load puppeteer. Any help would be appreciated. Answer 1: You can try to scrape the initial HTML. Actually scraping the rendered HTML is extremely hard to do, since you'd have to use a headless browser. There is this library: https://github.com

Using Apps Script to scrape javascript rendered web page

ぃ、小莉子 submitted on 2020-11-28 03:37:01
Question: I am struggling to put together a script to scrape a JavaScript-rendered web page through Apps Script. I found "How to scrape Javascript rendered websites using Javascript?" here, but I don't know how to put it together, such as how to load puppeteer. Any help would be appreciated. Answer 1: You can try to scrape the initial HTML. Actually scraping the rendered HTML is extremely hard to do, since you'd have to use a headless browser. There is this library: https://github.com