web-scraping

How to iterate over divs in Scrapy?

流过昼夜 posted on 2021-02-07 20:57:24
Question: It is probably a very trivial question, but I am new to Scrapy. I've tried to find a solution to my problem, but I just can't see what is wrong with this code. My goal is to scrape all of the opera shows from a given website. The data for every show is inside one div with class "row-fluid row-performance ". I am trying to iterate over them to retrieve it, but it doesn't work. It gives me the content of the first div in each iteration (I am getting the same show 19 times instead of different items). Thanks
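
The question's code is not shown in this excerpt, so the cause here is an assumption: a common reason for getting the first div on every iteration is that the query inside the loop runs against the whole document instead of the current row. In Scrapy that usually means an absolute XPath such as `row.xpath('//...')` where a relative `row.xpath('.//...')` is needed. The sketch below shows the same idea with the standard library's `xml.etree.ElementTree` on made-up, well-formed markup; the `h2` tags and show names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup; the real page is not shown in the question.
SAMPLE = """
<html><body>
  <div class="row-fluid row-performance "><h2>Carmen</h2></div>
  <div class="row-fluid row-performance "><h2>Tosca</h2></div>
  <div class="row-fluid row-performance "><h2>Aida</h2></div>
</body></html>
"""

root = ET.fromstring(SAMPLE)
ROW_PATH = ".//div[@class='row-fluid row-performance ']"
rows = root.findall(ROW_PATH)

# Buggy pattern: the inner query starts from the document root,
# so every iteration returns the first show.
buggy_titles = [root.find(ROW_PATH + "/h2").text for _ in rows]

# Fixed pattern: the inner query is relative to the current row.
titles = [row.find(".//h2").text for row in rows]

print(buggy_titles)
print(titles)
```

The buggy list repeats the first title once per row, which matches the symptom described; the fixed list contains one distinct title per row.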

webscraping data tables and data from a web page

我与影子孤独终老i posted on 2021-02-07 10:53:49
Question: I am trying to webscrape real-time streaming data tables and data from a web page. I tried: library(XML) webpage <- "http://www.investing.com/indices/us-30" tables <- readHTMLTable(webpage) n.rows <- unlist(lapply(tables, function(t) dim(t)[1])) tables n.rows but I get an error. Thank you for your help. Answer 1: I'm actually proud of myself for getting this to work (I have been trying to get my head around these kinds of things for a long time now....) library(rvest) url4 <- "http://www.investing

Getting embedded Facebook comments from a site

孤人 posted on 2021-02-07 10:47:15
Question: I would like to retrieve the embedded Facebook comments from the web page (http://www.example.com/sub_page_wFBcomments). What I know: I can use the Facebook Graph API to retrieve Facebook comments directly from facebook.com. The same is not true when the comments are embedded in the website of the Facebook page's owner. What I've tried: When using the Graph API like this: https://graph.facebook.com/v2.7/[apikey]/?key=value&access_token=[MyToken] { "link": "http://www.example.con/", "name":

Web-scraping a table to a list

感情迁移 posted on 2021-02-07 10:39:13
Question: I'm trying to extract a table from a webpage. I have managed to get all the data in the table into a list. However, all the table data is being put into one list element. I need assistance getting the 'clean' data (i.e. the strings, without all the HTML packaging) from the rows of the table into their own list elements. So instead of... list = [<tr> <th><a href="/7.62x25mm_TT_AKBS" title="7.62x25mm TT AKBS"><img alt="TTAKBS.png" decoding="async" height="64" src="https://static.wikia.nocookie
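
One way to get the cell strings of each row into their own list elements is to walk the markup and collect text per `<tr>`. This is a minimal sketch using only the standard library's `html.parser`, not the asker's own code (which is cut off above); the class name `TableRowParser`, the helper `table_to_rows`, and the sample table values are all hypothetical.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <th>/<td> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags such as <a> still belongs to the open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_rows(html):
    """Return a list of rows, each a list of clean cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample data, loosely shaped like the table in the question.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Damage</th></tr>
  <tr><th><a href="/x" title="Round A"><img alt="a.png" src="a.png"/>Round A</a></th><td>58</td></tr>
  <tr><th><a href="/y" title="Round B">Round B</a></th><td>50</td></tr>
</table>
"""

rows = table_to_rows(SAMPLE)
print(rows)
```

Each row ends up as its own list of plain strings, with the `<a>` and `<img>` wrappers discarded. A library such as BeautifulSoup can do the same with less code; the standard-library version is used here only to keep the sketch self-contained.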

Scrape website's Power BI dashboard using R

安稳与你 posted on 2021-02-07 10:21:10
Question: I have been trying to scrape my local government's Power BI dashboard using R, but it seems like it might be impossible. I've read on the Microsoft site that it is not possible to scrape Power BI dashboards, but several forums show that it is possible; however, I keep going around in circles. I am trying to scrape the Zip Code tab data from this dashboard: https://app.powerbigov.us/view?r

Scrape website's Power BI dashboard using R

让人想犯罪 __ posted on 2021-02-07 10:20:31
Question: I have been trying to scrape my local government's Power BI dashboard using R, but it seems like it might be impossible. I've read on the Microsoft site that it is not possible to scrape Power BI dashboards, but several forums show that it is possible; however, I keep going around in circles. I am trying to scrape the Zip Code tab data from this dashboard: https://app.powerbigov.us/view?r

rvest Webscraping in R with form inputs

眉间皱痕 posted on 2021-02-07 10:10:57
Question: I can't get my head around this problem in R, and I would really appreciate it if you could leave some advice for me here. I am trying to scrape historical bond yield data from https://www.investing.com/rates-bonds/spain-5-year-bond-yield-historical-data for personal use only (of course). The solution provided here works really well but only goes as far as scraping the first 24 time stamps of daily data: webscraping data tables and data from a web page What I am trying to achieve is to

How to get ETF Financial information (e.g. NAV) from Yahoo (with Quantmod)?

百般思念 posted on 2021-02-07 09:51:22
Question: I know that I can use the quantmod package to get stock financial information easily from Yahoo. For example, if I want to get the Volume, P/E ratio and Dividend Yield: > library(quantmod) > AAPL <- getSymbols("AAPL") Warning message: In download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : downloaded length 167808 != reported length 200 > what_metrics <- yahooQF(c("Name", + "Volume", + "P/E Ratio", + "Dividend Yield" + + )) > > getQuote(AAPL, what=what_metrics) Trade Time Name

How to get ETF Financial information (e.g. NAV) from Yahoo (with Quantmod)?

假装没事ソ posted on 2021-02-07 09:50:15
Question: I know that I can use the quantmod package to get stock financial information easily from Yahoo. For example, if I want to get the Volume, P/E ratio and Dividend Yield: > library(quantmod) > AAPL <- getSymbols("AAPL") Warning message: In download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : downloaded length 167808 != reported length 200 > what_metrics <- yahooQF(c("Name", + "Volume", + "P/E Ratio", + "Dividend Yield" + + )) > > getQuote(AAPL, what=what_metrics) Trade Time Name

Any way to scrape a link that redirects?

≯℡__Kan透↙ posted on 2021-02-07 09:47:58
Question: Is there any way I can make Python click a link, such as a bit.ly link, and then scrape the resulting link? When I am scraping a certain page, the only link I can scrape is one that redirects, and the information I need is located at the page it redirects to. Answer 1: There are 3 types of redirection: HTTP, as information in response headers (with code 301, 302, 3xx); HTML, as a <meta> tag in the HTML (Wikipedia: Meta refresh); JavaScript, as code like window.location = new_url. requests execute
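
HTTP (3xx) redirects are typically followed by the HTTP client itself, so the two cases that need extra work are the HTML meta refresh and the JavaScript assignment. The sketch below is one possible way to pull the target out of a page body using only the standard library; it is not the answer's own code (which is cut off above). The names `MetaRefreshParser`, `JS_REDIRECT`, and `find_redirect_target` are hypothetical, and the JavaScript pattern only matches a literal string assigned to `window.location`, not a variable such as `new_url`.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Remember the URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=https://example.com/next".
        match = re.search(r"url\s*=\s*['\"]?([^'\";]+)",
                          attrs.get("content") or "", re.I)
        if match:
            self.target = match.group(1).strip()


# Assumption: only a quoted literal assigned to window.location is detected.
JS_REDIRECT = re.compile(
    r"window\.location(?:\.href)?\s*=\s*['\"]([^'\"]+)['\"]")


def find_redirect_target(html, base_url):
    """Return the absolute redirect URL found in the page body, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target:
        return urljoin(base_url, parser.target)
    match = JS_REDIRECT.search(html)
    if match:
        return urljoin(base_url, match.group(1))
    return None


# Made-up pages for illustration.
meta_page = '<html><head><meta http-equiv="refresh" content="0; url=/landing"></head></html>'
js_page = '<script>window.location = "http://example.com/final";</script>'

print(find_redirect_target(meta_page, "http://example.com/short"))
print(find_redirect_target(js_page, "http://example.com/short"))
```

Redirects built from computed values in JavaScript cannot be recovered with a regular expression like this; those generally need a browser-driving tool or manual inspection of the script.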