web-scraping

Automation of iTunes Connect with VBA

自古美人都是妖i submitted on 2021-02-02 09:57:24
Question: I am trying to automate a report through VBA. I have worked with VBA before, but I am not able to log in to the iTunes Connect website through code. Someone told me that the login form is inside an iframe, but I have no idea how to handle that. I cannot even enter my username in the input box on the login page. https://itunesconnect.apple.com/login Dim HTMLdoc As HTMLDocument Dim MyBrowser As InternetExplorer Sub check() Dim MyHTML_element As IHTMLElement Dim MyURL As String MyURL = "https://itunesconnect.apple.com/login" Set MyBrowser = New

StaleElementReferenceException even after adding a wait while collecting data from Wikipedia using web scraping

心已入冬 submitted on 2021-02-02 09:56:26
Question: I am new to web scraping, so please pardon any silly mistakes. I am working on a project for which I need a list of movies as my data, and I am trying to collect it from Wikipedia using web scraping. Here is my code: def MoviesList(years, driver): for year in years: driver.implicitly_wait(150) year.click() table = driver.find_element_by_xpath('/html/body/div[3]/div[3]/div[5]/div[1]/table[2]/tbody') movies = table.find_elements_by_xpath('tr/td[1]/i
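
A stale reference usually means the page re-rendered after `year.click()`, so element handles located earlier no longer point at live DOM nodes; a longer implicit wait does not address that, because it only affects how long element lookups keep trying. A minimal sketch of one common workaround, repeating the lookup when the reference goes stale, might look like this. `retry_stale` and the stand-in `StaleElement` class are hypothetical names used for illustration; with Selenium you would pass `selenium.common.exceptions.StaleElementReferenceException` as `stale_exc`.

```python
import time


class StaleElement(Exception):
    """Stand-in (assumption) for Selenium's StaleElementReferenceException."""


def retry_stale(action, attempts=3, delay=0.0, stale_exc=StaleElement):
    """Call action() until it succeeds or the attempts run out.

    action should re-locate the element itself on every call, so each
    retry works with a fresh reference instead of the stale one.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except stale_exc as err:
            last_error = err
            time.sleep(delay)  # give the page a moment to settle
    raise last_error


# Demo with a fake lookup that goes stale twice before succeeding.
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise StaleElement("element is no longer attached to the DOM")
    return ["Movie A", "Movie B"]


movies = retry_stale(flaky_lookup)
print(movies)
```

In the scraper itself, the callable passed to `retry_stale` would perform the table and row lookups again rather than reuse elements found before the click.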

WebScraping JavaScript-Rendered Content using Selenium in Python

僤鯓⒐⒋嵵緔 submitted on 2021-02-02 02:09:03
Question: I am very new to web scraping and have been trying to use Selenium to simulate a browser accessing the Texas public contracting webpage and then download embedded PDFs. The website is this: http://www.txsmartbuy.com/sp. So far, I have successfully used Selenium to select an option in one of the dropdown menus, "Agency Name", and to click the search button. My Python code is listed below. import os os.chdir("/Users/fsouza/Desktop") #Setting up directory from bs4 import BeautifulSoup

How to unit test a web scraping service with PHPUnit

こ雲淡風輕ζ submitted on 2021-01-29 22:56:28
Question: I am currently developing a project in PHP + Laravel that needs to scrape data from two different websites. I am using the Goutte scraping library. I have 10 integration tests, where I use the Crawler object that Goutte's Client provides in order to get the specific data I want to scrape from each website. The tests work just fine (I even used the Infection library for mutation testing)... But the thing is that I think there could be a way to unit test all the functions (therefore, the tests would
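
One way to make such tests run without network access is to separate fetching from parsing and feed the parser a saved HTML fixture. The sketch below illustrates the idea in Python rather than PHP; `TitleParser`, `scrape_titles`, the `fetch` parameter, and the fixture are hypothetical, and it assumes the scraping service can receive its HTTP client from outside instead of creating it internally.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of every <h2> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2:
            self.titles[-1] += data


def scrape_titles(url, fetch):
    """Parse titles from whatever HTML the injected fetch(url) returns."""
    parser = TitleParser()
    parser.feed(fetch(url))
    return [title.strip() for title in parser.titles]


# Saved fixture standing in for a real page (made up for this example).
FIXTURE = "<html><body><h2> First </h2><p>x</p><h2>Second</h2></body></html>"


def fake_fetch(url):
    # No network call: the unit test controls exactly what the parser sees.
    return FIXTURE


titles = scrape_titles("https://example.com/articles", fake_fetch)
print(titles)
```

Because the fetcher is injected, the parsing logic can be exercised quickly and deterministically, while a small number of integration tests still cover the real websites.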

How to scrape links from a webpage using javascript?

梦想与她 submitted on 2021-01-29 22:17:25
Question: I'm looking to scrape the links of posts shown on a Facebook feed. I noticed that post links have two things in common. For example, in https://www.facebook.com/username/posts/1234567890, both https://www.facebook.com/ and /posts/ are always there. I used this code to get all links on the page, but I don't know how to grab only the links that contain both https://www.facebook.com/ and /posts/. var links = document.querySelectorAll("a[href^='https://www.facebook.com']"); for(var i = 0; i< links.length; i++){ console.log
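
In the browser, CSS attribute selectors can be chained, so a selector such as `a[href^='https://www.facebook.com'][href*='/posts/']` would express both conditions at once. The same filter is sketched below in Python for links that have already been collected; `is_post_link` and the sample URLs are hypothetical and only illustrate the two checks described in the question.

```python
from urllib.parse import urlparse


def is_post_link(href):
    """True when href points at www.facebook.com and its path contains /posts/."""
    parts = urlparse(href)
    return (
        parts.scheme == "https"
        and parts.netloc == "www.facebook.com"
        and "/posts/" in parts.path
    )


# Sample hrefs, made up for this example.
hrefs = [
    "https://www.facebook.com/username/posts/1234567890",
    "https://www.facebook.com/username/photos/42",
    "https://www.facebook.com/groups/example",
    "https://example.com/posts/1",
]

post_links = [href for href in hrefs if is_post_link(href)]
print(post_links)
```

Parsing the URL instead of searching the raw string keeps a `/posts/` fragment in a query string or on another domain from matching by accident.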

I cannot autologin to pastebin using requests + BeautifulSoup

与世无争的帅哥 submitted on 2021-01-29 20:53:39
Question: I am trying to auto-login to a pastebin account using Python, but I'm failing and I don't know why. I copied the request headers exactly and double-checked them, but I am still greeted with a 400 HTTP status code. Can somebody help me? This is my code: import requests from bs4 import BeautifulSoup import subprocess import os import sys from requests import Session # the actual program page = requests.get("https://pastebin.com/99qQTecB") parse = BeautifulSoup(page.content, 'html.parser') string = parse.find(
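
A 400 response to a login POST is often caused by a missing per-session form token rather than by wrong headers. Whether that is the cause here is an assumption, since the code in the question is cut off. The sketch below only shows how hidden form fields could be collected from a login page's HTML so that they can be sent back together with the credentials through the same `requests.Session`. The sample HTML and the field names `csrf_token`, `username`, and `password` are made up for illustration and are not taken from pastebin.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("type") == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in html."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Hypothetical login page; a real page will use different field names.
SAMPLE_LOGIN_PAGE = """
<form method="post" action="/login">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

payload = hidden_fields(SAMPLE_LOGIN_PAGE)
payload.update({"username": "my_user", "password": "my_password"})
print(payload)
```

With `requests`, the login page would be fetched and the payload posted through one `Session` object so that cookies set by the first request accompany the second.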

Using if or try in a loop to check for an element on a page with Selenium

﹥>﹥吖頭↗ submitted on 2021-01-29 19:27:54
Question: I am trying to scrape agents data here. I am able to get the links from the first page. I am using numbered loops because I know the total number of pages. I also tried to make the loop run for as long as the "next" page option is there. I tried both "try" and "if not" but wasn't able to figure it out. Any help is welcome. Here is the code. from selenium import webdriver import time from selenium.common.exceptions import ElementNotVisibleException, NoSuchElementException from selenium.webdriver.common.by
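
A minimal sketch of the "keep going while a next page exists" loop is shown below, using fake pages so that it runs without a browser. `collect_all_links`, `get_links`, `go_next`, and the stand-in `NoNextPage` exception are hypothetical names. With Selenium, `go_next` would look up and click the "next" control, and `find_element` raises `NoSuchElementException` when nothing matches, which would take the place of `NoNextPage`.

```python
class NoNextPage(Exception):
    """Stand-in (assumption) for Selenium's NoSuchElementException."""


def collect_all_links(first_page, get_links, go_next, max_pages=1000):
    """Gather links page by page until go_next reports there is no next page."""
    links = []
    page = first_page
    for _ in range(max_pages):  # upper bound guards against endless loops
        links.extend(get_links(page))
        try:
            page = go_next(page)
        except NoNextPage:
            break  # last page reached
    return links


# Fake site: page number -> (links on that page, next page or None).
FAKE_SITE = {
    1: (["a1", "a2"], 2),
    2: (["b1"], 3),
    3: (["c1", "c2"], None),
}


def get_links(page):
    return FAKE_SITE[page][0]


def go_next(page):
    next_page = FAKE_SITE[page][1]
    if next_page is None:
        raise NoNextPage("no 'next' control on this page")
    return next_page


all_links = collect_all_links(1, get_links, go_next)
print(all_links)
```

The links of the current page are collected before the "next" lookup, so the last page is still scraped even though the lookup fails there.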