web-scraping

Cheerio, axios, reactjs to web scrape a table off a webpage returning empty list

非 Y 不嫁゛ submitted on 2021-01-29 08:53:50
Question: I'm trying to scrape the table on this page: https://www.investing.com/commodities/real-time-futures. For some reason, I keep getting an empty list when I try to get the data. This is what I'm doing to fetch and parse it:

    componentDidMount() {
      axios.get(`https://www.investing.com/commodities/real-time-futures`)
        .then(response => {
          if (response.status === 200) {
            const html = response.data;
            const $ = cheerio.load(html);
            let data = [];
            $('#cross_rate_1 tr').each((i, elem) => {
              data

Press on link inside table with vba on webpage

﹥>﹥吖頭↗ submitted on 2021-01-29 08:42:19
Question: I figured out how to navigate to the webpage and make it search for my specific input and so forth. However, when I get the list of results, I am not able to make VBA click on it. I have tried different variations of: objIE.document.getElementById("isc_ListGrid_1_body$28s").Click but I can't figure out how to press the specific cell (marked in blue in the picture): https://imgur.com/inH25WQ I'm not getting any errors, but it's not clicking on it either.
Answer 1: You want the id of the table, not the id

Scrape table from website

↘锁芯ラ submitted on 2021-01-29 08:39:36
Question: I have the following code, which navigates to a website, enters two names (used here as an example; the real run will pull a list of 10 names from a spreadsheet), then searches for their records. I'm trying to pull the resulting table into a spreadsheet. I've tried it a few ways but can't seem to get it to work. I'm looking for code to go under the comment "Scrape Table Here". I know this involves accessing the site's HTML, which I can also do, but I'm not familiar enough with

How to replace or update “Style” attribute value by using VBA

旧巷老猫 submitted on 2021-01-29 08:31:43
Question: I am feeding data into a website using VBA. I want to change, replace, or update the value of the "style" attribute on that website.

HTML:

    <div class="timeline-row-item" style="left: 556px;"></div>

I want to change the value from style="left: 556px;" to style="left: 300px;"

    Sub test1()
        Dim IE As New SHDocVw.InternetExplorer
        IE.document.querySelectorAll("div[class*= timeline-row-iteml]").setAttribute ("style", left: 300px;")

How can I do this in Excel VBA? Thank you.
Answer 1: querySelectorAll returns a

Web Scraping with Python and newspaper3k lib does not return data

只谈情不闲聊 submitted on 2021-01-29 08:13:08
Question: I have installed the newspaper3k library on my Mac with sudo pip3 install newspaper3k. I'm using Python 3. I want to return the data that the Article object supports, namely URL, date, title, text, summary, and keywords, but I do not get any data:

    import newspaper
    from newspaper import Article

    # creating website for scraping
    cnn_paper = newspaper.build('https://www.euronews.com/', memoize_articles=False)
    # I have tried for https://www.euronews.com/, https://edition.cnn.com/, https://www.bbc.com

Scrapy use item and save data in a json file

这一生的挚爱 submitted on 2021-01-29 08:01:49
Question: I want to use a Scrapy item, manipulate the data, and save everything in a JSON file (using the JSON file like a database).

    # Spider Class
    class Spider(scrapy.Spider):
        name = 'productpage'
        start_urls = ['https://www.productpage.com']

        def parse(self, response):
            for product in response.css('article'):
                link = product.css('a::attr(href)').get()
                id = link.split('/')[-1]
                title = product.css('a > span::attr(content)').get()
                product = Product(self.name, id, title, price,'', link)
                yield scrapy.Request('{}.json'.format
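
The "JSON file as a database" idea in the question can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the asker's code: the `Product` dataclass below is a hypothetical stand-in for the Scrapy item (its field names are guesses), and `load_db` and `upsert` are illustrative helper names. In a real Scrapy project, logic like this would probably sit in an item pipeline's `process_item` method.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Product:
    # Hypothetical stand-in for the question's Product item; field names are assumptions.
    source: str
    id: str
    title: str
    price: str
    link: str


def load_db(path: Path) -> dict:
    """Return the stored records keyed by product id, or an empty dict if no file exists."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def upsert(path: Path, product: Product) -> dict:
    """Insert or overwrite one product record, then rewrite the whole file."""
    db = load_db(path)
    db[product.id] = asdict(product)
    path.write_text(json.dumps(db, indent=2, ensure_ascii=False), encoding="utf-8")
    return db
```

Keying the records by product id means a product scraped twice overwrites its earlier record instead of being appended again. Rewriting the whole file on every update is simple but only reasonable for small data sets.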

Scraping text from unordered lists using beautiful soup and python

允我心安 submitted on 2021-01-29 07:22:42
Question: I am using Python and Beautiful Soup to scrape information from a web page. I am interested in the following section of the source code:

    <ul class="breadcrumb">
      <li><a href="/" title="Return to the home page">Home</a><span class="sprite icon-delimiter"></span></li>
      <li><a href="/VehicleSearch/Search/Mini" title="View our range of Mini vehicles">Mini</a><span class="sprite icon-delimiter"></span></li>
      <li class="active"><a href="/VehicleSearch/Search/Mini/Countryman" title="View our range of Mini
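
One way to pull the link text out of a list like this is sketched below. To keep the example free of third-party dependencies it uses the standard library's `html.parser` instead of Beautiful Soup, so it is a swapped-in technique and not the asker's approach; with Beautiful Soup the equivalent would be roughly `[a.get_text(strip=True) for a in soup.select("ul.breadcrumb a")]`. The names `BreadcrumbParser` and `breadcrumb_labels` are hypothetical, and the sample HTML contains only the two complete `<li>` items from the question.

```python
from html.parser import HTMLParser


class BreadcrumbParser(HTMLParser):
    """Collect the text of every <a> inside <ul class="breadcrumb">."""

    def __init__(self):
        super().__init__()
        self.in_breadcrumb = False
        self.in_link = False
        self.labels = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul" and "breadcrumb" in (attrs.get("class") or "").split():
            self.in_breadcrumb = True
        elif tag == "a" and self.in_breadcrumb:
            # Start a new label; its text arrives through handle_data.
            self.in_link = True
            self.labels.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False
        elif tag == "ul":
            self.in_breadcrumb = False

    def handle_data(self, data):
        if self.in_link:
            self.labels[-1] += data


# Abbreviated sample: only the two complete <li> items from the question.
SAMPLE = """
<ul class="breadcrumb">
<li><a href="/" title="Return to the home page">Home</a><span class="sprite icon-delimiter"></span></li>
<li><a href="/VehicleSearch/Search/Mini" title="View our range of Mini vehicles">Mini</a><span class="sprite icon-delimiter"></span></li>
</ul>
"""


def breadcrumb_labels(html: str) -> list:
    """Return the stripped link texts found in the breadcrumb list."""
    parser = BreadcrumbParser()
    parser.feed(html)
    return [label.strip() for label in parser.labels]
```

Calling `breadcrumb_labels(SAMPLE)` returns the link texts in document order. The parser assumes the breadcrumb list is not nested inside another `<ul>`.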

log_count/ERROR while scraping site with Scrapy

本小妞迷上赌 submitted on 2021-01-29 07:22:15
Question: I am getting the following log_count/ERROR while scraping a site with Scrapy. I can see that it has made 43 requests and got 43 responses. Everything looks fine. So what is the error for?

    2018-03-19 00:31:30 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
    {'downloader/request_bytes': 18455,
     'downloader/request_count': 43,
     'downloader/request_method_count/GET': 43,
     'downloader/response_bytes': 349500,
     'downloader/response_count': 43,
     'downloader/response_status_count/200': 38,
     'downloader

Scrapy repeating rows

邮差的信 submitted on 2021-01-29 07:21:08
Question: I'm trying to scrape this site: https://www.tahko.com/fi/menovinkit/?ql=tapahtumat. In particular, I'm trying to scrape the 3 tables on the site. I've managed this with

    tables = response.xpath('//*[@class="table table-stripefd"]')

Then I'd like to get each of the rows of the table, which I did with

    rows = tables.xpath('//tr')

The problem is that, after scraping and printing out some of the data, I noticed that there are multiple entries for some rows. For example, the
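
One likely cause of the repeated rows, offered as an assumption since the excerpt is cut off: in Scrapy, an XPath that starts with `//` is evaluated against the whole document even when it is called on a nested selector, so `tables.xpath('//tr')` returns every `<tr>` on the page once per table. Starting the expression with a dot, as in `tables.xpath('.//tr')`, restricts the search to each table. The sketch below illustrates the difference using the standard library's `xml.etree.ElementTree` as a stand-in for Scrapy selectors; the `PAGE` markup is hypothetical. ElementTree rejects absolute paths on elements, so the document-wide search is modelled by searching from the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a page with two tables.
PAGE = """
<html><body>
  <table class="events"><tr><td>a1</td></tr><tr><td>a2</td></tr></table>
  <table class="events"><tr><td>b1</td></tr></table>
</body></html>
""".strip()

root = ET.fromstring(PAGE)
tables = root.findall(".//table")

# Document-wide search: every row of every table, regardless of which table it is in.
all_rows = root.findall(".//tr")

# Relative search: only the cells in rows below each individual table.
rows_per_table = [
    [td.text for td in table.findall(".//tr/td")]
    for table in tables
]
```

Here `all_rows` holds the rows of both tables together, while `rows_per_table` keeps each table's cells separate, which is the behaviour the relative `.//tr` form gives in Scrapy.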

Is this site not suited for web scraping using beautifulsoup?

醉酒当歌 submitted on 2021-01-29 07:17:47
Question: I am trying to use BeautifulSoup to get the odds for each match on the following site: https://danskespil.dk/oddset/sports/category/990/counter-strike-go/matches The goal is to end up with some kind of text file containing the following:

    Match1, Team1, Odds for team1 winning, Team2, Odds for team2 winning
    Match2, Team1, Odds for team1 winning, Team2, Odds for team2 winning
    and so on...

I am new to BeautifulSoup, so things already go wrong at a very elementary level. My approach is to "walk" through