web-scraping | 易学教程

Cheerio, axios, reactjs to web scrape a table off a webpage returning empty list

Question: I'm trying to scrape the table on this page: https://www.investing.com/commodities/real-time-futures but for some reason, when I try to get the data, I keep getting an empty list. This is what I'm doing to get the data and parse it: componentDidMount() { axios.get(`https://www.investing.com/commodities/real-time-futures`) .then(response => { if(response.status === 200) { const html = response.data; const $ = cheerio.load(html); let data = []; $('#cross_rate_1 tr').each((i, elem) => { data

Press on link inside table with vba on webpage

Question: I figured out how to navigate to the webpage and make it search for my specific input and so forth. However, when I get the list of results, I am not able to make VBA click on it. I have tried different variations of: objIE.document.getElementById("isc_ListGrid_1_body$28s").Click But I'm unable to figure out how to press the specific cell (marked in blue in the picture): https://imgur.com/inH25WQ I'm not getting any errors, but it's not clicking on the cell either. Answer 1: You want the id of the table, not the id

Scrape table from website

Question: I have the following code, which navigates to a website, enters two names (used here as an example; the real names will come from a list of 10 names in a spreadsheet), then searches for their records. I'm trying to pull the resulting table into a spreadsheet. I've tried it a few ways but can't seem to get it to work. I'm looking for code to go under the comment "Scrape Table Here". I know this involves accessing the site's HTML, which I can also do, but I'm not familiar enough with

How to replace or update “Style” attribute value by using VBA

Question: I am feeding data into a website using VBA. I want to change, replace, or update the value of the "style" attribute on that website. HTML: <div class="timeline-row-item" style="left: 556px;"></div> I want to change the value from style="left: 556px;" to style="left: 300px;" Sub test1() Dim IE As New SHDocVw.InternetExplorer IE.document.querySelectorAll("div[class*= timeline-row-item]").setAttribute ("style", "left: 300px;") How can I do this in Excel VBA? Thank you. Answer 1: querySelectorAll returns a

Web scraping with Python and the newspaper3k lib does not return data

Question: I have installed the newspaper3k library on my Mac with sudo pip3 install newspaper3k. I'm using Python 3. I want to return the data supported by the Article object, that is, url, date, title, text, summary and keywords, but I do not get any data: import newspaper from newspaper import Article #creating website for scraping cnn_paper = newspaper.build('https://www.euronews.com/', memoize_articles=False) #I have tried for https://www.euronews.com/, https://edition.cnn.com/, https://www.bbc.com

Scrapy use item and save data in a json file

Question: I want to use a Scrapy item, manipulate the data, and save everything in a JSON file (using the JSON file like a database). # Spider Class class Spider(scrapy.Spider): name = 'productpage' start_urls = ['https://www.productpage.com'] def parse(self, response): for product in response.css('article'): link = product.css('a::attr(href)').get() id = link.split('/')[-1] title = product.css('a > span::attr(content)').get() product = Product(self.name, id, title, price,'', link) yield scrapy.Request('{}.json'.format

Scraping text from unordered lists using beautiful soup and python

Question: I am using Python and Beautiful Soup to scrape information from a web page. I am interested in the following section of the source code: <ul class="breadcrumb"> <li><a href="/" title="Return to the home page">Home</a><span class="sprite icon-delimiter"></span></li> <li><a href="/VehicleSearch/Search/Mini" title="View our range of Mini vehicles">Mini</a><span class="sprite icon-delimiter"></span></li> <li class="active"><a href="/VehicleSearch/Search/Mini/Countryman" title="View our range of Mini
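
The excerpt above is cut off, so the asker's exact goal is not visible. As a rough illustration only, the sketch below pulls the link text and href out of a `breadcrumb` list. It swaps Beautiful Soup for the standard-library `html.parser` so that it runs without extra packages. The sample markup reuses only the two complete `<li>` items quoted in the question; the closing `</ul>` and the `BreadcrumbParser` class name are assumptions added for the example.

```python
from html.parser import HTMLParser

# Sample markup: the two complete <li> items quoted in the question.
# The closing </ul> is an assumption, since the quoted source is truncated.
HTML = """
<ul class="breadcrumb">
<li><a href="/" title="Return to the home page">Home</a><span class="sprite icon-delimiter"></span></li>
<li><a href="/VehicleSearch/Search/Mini" title="View our range of Mini vehicles">Mini</a><span class="sprite icon-delimiter"></span></li>
</ul>
"""


class BreadcrumbParser(HTMLParser):
    """Hypothetical helper: collects (text, href) for links inside ul.breadcrumb."""

    def __init__(self):
        super().__init__()
        self.in_breadcrumb = False
        self.in_link = False
        self.crumbs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "ul" and "breadcrumb" in classes:
            self.in_breadcrumb = True
        elif tag == "a" and self.in_breadcrumb:
            self.in_link = True
            self.crumbs.append({"href": attrs.get("href"), "text": ""})

    def handle_endtag(self, tag):
        if tag == "ul":
            self.in_breadcrumb = False
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Only text that sits directly inside a breadcrumb link is kept.
        if self.in_link:
            self.crumbs[-1]["text"] += data


parser = BreadcrumbParser()
parser.feed(HTML)
crumbs = [(c["text"], c["href"]) for c in parser.crumbs]
print(crumbs)
```

With Beautiful Soup itself, the same idea would usually be expressed by selecting the `ul` by class and iterating over its `a` tags; the state-machine parser here is just the dependency-free equivalent.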

log_count/ERROR while scraping site with Scrapy

Question: I am getting the following log_count/ERROR while scraping a site with Scrapy. I can see that it made 43 requests and got 43 responses. Everything looks fine, so what is the error for? 2018-03-19 00:31:30 [scrapy.statscollectors] INFO: Dumping Scrapy stats: {'downloader/request_bytes': 18455, 'downloader/request_count': 43, 'downloader/request_method_count/GET': 43, 'downloader/response_bytes': 349500, 'downloader/response_count': 43, 'downloader/response_status_count/200': 38, 'downloader

Scrapy repeating rows

Question: I'm trying to scrape this site: https://www.tahko.com/fi/menovinkit/?ql=tapahtumat. In particular, I'm trying to scrape the 3 tables on the site. I've managed this with tables = response.xpath('//*[@class="table table-stripefd"]') Then I'd like to get each of the rows of the table, which I did with rows = tables.xpath('//tr') The problem is that after scraping and printing out some of the data, I noticed that there are multiple entries for some rows. For example, the
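
One likely cause of the repeated rows, judging only from the two XPath calls quoted above, is that `//tr` is an absolute path: in XPath it searches from the document root even when called on a nested selector, so every table yields every row on the page. A path relative to the current node (`.//tr`) keeps each search inside its own table. The sketch below imitates that difference with the standard-library `xml.etree.ElementTree`, which does not accept absolute paths on an element, so a search from the root stands in for `//tr`. The sample page, the `events` class, and both function names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a page with two tables.
PAGE = """
<html><body>
  <table class="events"><tr><td>A1</td></tr><tr><td>A2</td></tr></table>
  <table class="events"><tr><td>B1</td></tr></table>
</body></html>
"""

root = ET.fromstring(PAGE)
tables = root.findall(".//table[@class='events']")


def rows_from_document(root, tables):
    """Imitate `//tr`: each table's lookup searches the whole document."""
    return [td.text for _ in tables for td in root.findall(".//tr/td")]


def rows_per_table(tables):
    """Imitate `.//tr`: each lookup is scoped to a single table."""
    return [td.text for table in tables for td in table.findall(".//tr/td")]


duplicated = rows_from_document(root, tables)
scoped = rows_per_table(tables)
print(duplicated)  # every row appears once per table
print(scoped)      # every row appears exactly once
```

In the Scrapy code from the question, the analogous change would be `tables.xpath('.//tr')`, assuming the duplication really does come from the absolute path.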

Is this site not suited for web scraping using beautifulsoup?

Question: I am trying to use Beautiful Soup to get the odds for each match on the following site: https://danskespil.dk/oddset/sports/category/990/counter-strike-go/matches The goal is to end up with some kind of text file containing the following: Match1, Team1, Odds for team1 winning, Team2, Odds for team2 winning Match2, Team1, Odds for team1 winning, Team2, Odds for team2 winning and so on... I am new to Beautiful Soup, so things already go wrong at a very elementary level. My approach is to "walk" through