web-scraping | 易学教程

How to select a country from the https://www.aliexpress.com/ "Ship to" dropdown menu using Selenium and Python

Question: On the website https://www.aliexpress.com, I need to change the country from the dropdown menu using Selenium: <span class="ship-to">. I can't find how to click on the country value using Selenium.

Answer 1: To select Afghanistan as the country from the Ship to dropdown menu, you have to induce WebDriverWait for element_to_be_clickable(), and you can use the following XPath-based locator strategies:

Code Block:
driver.get("https://www.aliexpress.com/")
WebDriverWait(driver, 20).until(EC.element_to

Scraping Wikipedia information (table)

Question: I need to scrape information about the Elenco dei comuni per regione on Wikipedia. I would like to create an array that lets me associate each comune with its corresponding region, i.e. something like this:

'Abbateggio': 'Pescara' -> Abruzzo

I tried to get the information using BeautifulSoup and requests as follows:

from bs4 import BeautifulSoup as bs
import requests

with requests.Session() as s:  # use session object for efficiency of tcp re-use
    s.headers = {'User-Agent': 'Mozilla/5.0
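The comune-to-region association described in the question can be sketched as follows, using only the standard library instead of BeautifulSoup. The sample HTML, the column order (comune, province, region), and the names `TableRowParser` and `parse_comuni` are assumptions for illustration; the actual Wikipedia markup is not shown in the excerpt and would need to be checked.

```python
from html.parser import HTMLParser

# Hypothetical sample table; the real Wikipedia markup may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Comune</th><th>Provincia</th><th>Regione</th></tr>
  <tr><td>Abbateggio</td><td>Pescara</td><td>Abruzzo</td></tr>
  <tr><td>Acciano</td><td>L'Aquila</td><td>Abruzzo</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows hold only <th> cells, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_comuni(html):
    """Return {comune: (province, region)} for every three-cell data row."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return {row[0]: (row[1], row[2]) for row in parser.rows if len(row) == 3}


comuni = parse_comuni(SAMPLE_HTML)
print(comuni["Abbateggio"])
```

With real data, the HTML string would come from the response body fetched with requests; the lookup `comuni[name]` then gives the province and region for a comune.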

Python - Scrape movie titles with Splash & BS4

Question: I am trying to write my first script with Python, using Splash and BS4. I followed this tutorial from John Watson Rooney (but with my own target): How I Scrape JAVASCRIPT websites with Python. My goal is to scrape this website survey: Best movies of 2020.

Here's my problem: the script returns the same titles multiple times, with up to 6 duplicates in the list and in no logical order. Sometimes it returns fewer than 100 lines, sometimes more.

What I want: Get the 100 titles, in order. Export them
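As a stopgap for the duplicated titles described above, the scraped list can be de-duplicated while keeping first-seen order. This is only a sketch: the function name `unique_in_order` and the sample titles are made up for illustration, and it does not address why the page yields duplicates in the first place, which cannot be determined from the excerpt.

```python
def unique_in_order(titles, limit=100):
    """Drop repeated titles, keep first-seen order, stop after `limit` items."""
    seen = set()
    result = []
    for title in titles:
        key = title.strip()  # ignore surrounding whitespace when comparing
        if key and key not in seen:
            seen.add(key)
            result.append(key)
            if len(result) == limit:
                break
    return result


# Hypothetical scraped output containing repeats.
scraped = ["Title A", "Title B", "Title A", " Title B ", "Title C"]
print(unique_in_order(scraped))
```

Note that this preserves the order in which titles were scraped; if the scrape itself returns them out of order, the ranking has to be read from the page rather than inferred from list position.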

All elements from html not being extracted by Requests and BeautifulSoup in Python

Question: I am trying to scrape odds from a site that displays current odds from different agencies, for an assignment on the effects of market competition. I am using Requests and BeautifulSoup to extract the relevant data. However, after running:

import requests
from bs4 import BeautifulSoup

url = "https://www.bestodds.com.au/odds/cricket/ICC-World-Twenty20/Sri-Lanka-v-Afghanistan_71992/"
r = requests.get(url)
print(r.text)

it does not print any odds, yet if I inspect the element on the page I can see them

Accessing the contents of links provided on a webpage while web scraping

Question: This is a follow-up to my previous question. I am trying to access the contents of a webpage. I can search for contents on the webpage, but I am not sure how to access the contents behind the links given on it. For instance, the first line of the search result for id 1.1.1.1 is 36EUL/ADL_7 1.1.1.1 spectrophotometry .... C ... . The secondary id 36EUL/ADL_7, in the first line, is a link that opens another page when clicked. I am not sure how to access the contents of the search
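One common approach to the link-following step described above is to collect each anchor's href from the result page and resolve it against the page URL, so that every linked page can then be requested in turn. The sketch below uses only the standard library; the base URL, the sample markup, and the class name `LinkCollector` are hypothetical, since the actual site is not named in the excerpt.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (link text, absolute URL) pairs for every <a href=...> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # absolute URL of the anchor being parsed
        self._text = []     # text fragments inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative hrefs against the page they came from.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical result page and markup, for illustration only.
BASE_URL = "https://example.org/search/results.html"
SAMPLE_HTML = (
    '<pre><a href="entry/36EUL-ADL_7.html">36EUL/ADL_7</a> '
    "1.1.1.1 spectrophotometry</pre>"
)

collector = LinkCollector(BASE_URL)
collector.feed(SAMPLE_HTML)
collector.close()
print(collector.links)
```

Each collected URL can then be fetched the same way the search page was, and its body parsed for the secondary record.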

Python 3: Scraping all information from one page

Question: My Spider:

import scrapy

class LinkSpider(scrapy.Spider):
    name = "page"
    start_urls = [
        'https://www.topart-online.com/de/Blattzweige-Blatt-und-Bluetenzweige/l-KAT282?seg=1'
    ]

    def parse(self, response):
        yield {
            'ItemSKU': response.xpath('//span[@class="sn_p01_pno"]/text()').getall(),
            'title': response.xpath('//div[@class="sn_p01_desc h4 col-12 pl-0 pl-sm-3 pull-left"]/text()').getall(),
            'ItemEAN': response.xpath('//div[@class="productean"]/text()').getall(),
            'Delivery_Status': response.xpath('/

VBA Web Scraping: Can't access table on web page

Question: I tried to scrape the price data table on this website, https://www.energylive.cloud/, as I have done on other sites, but I can't (I don't have much experience with scraping). Thanks in advance!

Sub ej()
    Dim XMLrequest As New MSXML2.XMLHTTP60
    Dim HTMLdoc As New MSHTML.HTMLDocument
    Dim HTMLtable As MSHTML.IHTMLElement
    'Dim HTMLi As MSHTML.IHTMLElementCollection
    Dim url As String

    url = "https://www.energylive.cloud/"
    XMLrequest.Open "GET", url, False
    XMLrequest.send

    If XMLrequest.Status <> 200 Then

Print output as a list

Question: The following code runs fine. It gathers information for each listing on LinkedIn. (Account info is given and free to use, as it is a test account.) However, the output joins the data together instead of giving each field its own column. I want the output written to Excel with each field in the dictionary (Name, Company, Location) in its own column and each value in its own cell. See the attachment for an example of the expected output. I have tried BeautifulSoup, but I don't think that works.

import time
import
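For the one-field-per-column output described above, one option that Excel can open directly is a CSV file written with `csv.DictWriter`. This is a minimal sketch, not the asker's code: the field names come from the question, while the function name `rows_to_csv` and the sample rows are placeholders.

```python
import csv
import io

FIELDS = ["Name", "Company", "Location"]  # field names taken from the question


def rows_to_csv(rows, fieldnames=FIELDS):
    """Serialize a list of dicts to CSV text, one column per field."""
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder records standing in for the scraped listings.
listings = [
    {"Name": "Person A", "Company": "Company X", "Location": "City 1"},
    {"Name": "Person B", "Company": "Company Y", "Location": "City 2"},
]

csv_text = rows_to_csv(listings)
print(csv_text)
```

To produce a file instead of a string, the same writer can be pointed at a file opened with `open("output.csv", "w", newline="", encoding="utf-8")`; the file name here is arbitrary.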