scrape

How would I scrape this JSON info using PHP and MySQL?

不羁的心 submitted on 2019-12-12 22:42:31
Question: Here's the info I'm trying to break up into a database. I'm only going to use it myself, to analyse statistics. I have been doing it manually with Excel, but I'd like to save myself some work in future. The URL is: http://fantasy.premierleague.com/web/api/elements/537/ Any idea how to scrape that info or easily convert it to Excel format? I know a bit of PHP and MySQL, but nothing about JSON and very little about scraping (I tried messing with SIMPLE_HTML_DOM). Answer 1: You
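
A minimal sketch of the general idea, written in Python rather than the PHP the question asks about: decode the JSON, keep the flat fields, and write them out as CSV, which Excel opens directly. The sample payload and its field names (`web_name`, `total_points`, `history`) are hypothetical stand-ins, since the real response from the URL above is not shown here.

```python
import csv
import io
import json

# Hypothetical sample payload; the real API response is not shown in the question.
SAMPLE_JSON = '{"id": 537, "web_name": "Example", "total_points": 42, "history": [1, 2]}'


def json_to_csv(raw_json: str) -> str:
    """Decode one JSON object and return its scalar fields as a two-line CSV."""
    record = json.loads(raw_json)
    # Keep only flat values; nested lists/dicts would need a table of their own.
    flat = {k: v for k, v in record.items() if not isinstance(v, (list, dict))}
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(flat))
    writer.writeheader()
    writer.writerow(flat)
    return buffer.getvalue()


csv_text = json_to_csv(SAMPLE_JSON)
print(csv_text)
```

The same flat dictionary could just as well feed an SQL `INSERT`; CSV is used here only because the question mentions Excel.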

Get data between two tags in Python

好久不见. submitted on 2019-12-12 08:17:21
Question: <h3> <a href="article.jsp?tp=&arnumber=16"> Granular computing based <span class="snippet">data</span> <span class="snippet">mining</span> in the views of rough set and fuzzy set </a> </h3> Using Python, I want to get the text from the anchor tag, which should be: Granular computing based data mining in the views of rough set and fuzzy set. I tried using lxml: parser = etree.HTMLParser() tree = etree.parse(StringIO.StringIO(html), parser) xpath1 = "//h3/a/child::text() | //h3/a/span/child::text(
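
One possible approach, sketched with the standard library's `html.parser` instead of the lxml call the question attempts: collect every text node inside the anchor, including the ones nested in the `<span>` elements, then collapse the whitespace. The class and function names (`AnchorText`, `anchor_text`) are made up for this illustration.

```python
from html.parser import HTMLParser

HTML = (
    '<h3> <a href="article.jsp?tp=&arnumber=16"> Granular computing based '
    '<span class="snippet">data</span> <span class="snippet">mining</span> '
    'in the views of rough set and fuzzy set </a> </h3>'
)


class AnchorText(HTMLParser):
    """Collect every text node found inside <h3><a>...</a></h3>."""

    def __init__(self):
        super().__init__()
        self.in_h3 = False
        self.in_anchor = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True
        elif tag == "a" and self.in_h3:
            self.in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False
        elif tag == "h3":
            self.in_h3 = False

    def handle_data(self, data):
        # Text inside nested <span> tags still arrives here while in_anchor is set.
        if self.in_anchor:
            self.chunks.append(data)


def anchor_text(html: str) -> str:
    parser = AnchorText()
    parser.feed(html)
    parser.close()
    # Join the pieces, then collapse runs of whitespace into single spaces.
    return " ".join("".join(parser.chunks).split())


print(anchor_text(HTML))
```

The point of the sketch is that the text has to be gathered from all descendants of the anchor in document order, not from the anchor's direct children alone.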

Scrapy Crawl all websites in start_url even if redirect

南笙酒味 submitted on 2019-12-12 03:08:15
Question: I am trying to crawl a long list of websites. Some of the websites in the start_urls list redirect (301). I want Scrapy to crawl the redirected websites from the start_urls list as if they were also on the allowed_domains list (which they are not). For example, example.com is on both my start_urls list and my allowed_domains list, and example.com redirects to foo.com. I want to crawl foo.com. DEBUG: Redirecting (301) to <GET http://www.foo.com/> from <GET http://www.example.com> I tried dynamically adding
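
The core of what the question describes is adding the redirect target's host to the set of allowed domains. The sketch below shows only that step in plain Python; it does not use any Scrapy API, and how to wire it into a spider is left open. The helper names (`registered_host`, `extend_allowed`) are hypothetical, and stripping a leading `www.` is a simplifying assumption.

```python
from urllib.parse import urlsplit


def registered_host(url: str) -> str:
    """Return the host of `url` with a leading 'www.' removed (a simplification)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def extend_allowed(allowed: set, redirect_target: str) -> set:
    """Return a copy of `allowed` that also contains the redirect target's host."""
    return allowed | {registered_host(redirect_target)}


# Values taken from the DEBUG line quoted in the question.
allowed = {"example.com"}
allowed = extend_allowed(allowed, "http://www.foo.com/")
print(sorted(allowed))
```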

UDP Tracker Scraping 1 script working other Not

别说谁变了你拦得住时间么 submitted on 2019-12-12 01:31:21
Question: With this script, my tracker only updates seeds and leechers from the HTTP tracker, and only for the first tracker of my torrent. print("<tr><td class='desc'><b>" .T_("Torrent Stats"). ": </b></td><td valign='top' class='lista'>"); $seeders1 = $leechers1 = $downloaded1 = null; $tres = SQL_Query_exec("SELECT url FROM announce WHERE torrent=$id"); while ($trow = mysql_fetch_assoc($tres)) { $ann = $trow["url"]; $tracker = explode("/", $ann); $path = array_pop($tracker); $oldpath = $path; $path = preg_replace("/

how to scrape multiple pages from one site

一个人想着一个人 submitted on 2019-12-11 17:17:45
Question: I want to scrape multiple pages from one site. The pattern is like this: https://www.example.com/S1-3-1.html https://www.example.com/S1-3-2.html https://www.example.com/S1-3-3.html https://www.example.com/S1-3-4.html https://www.example.com/S1-3-5.html. I tried three methods to scrape all of these pages at once, but every method only scrapes the first page. The code is below; if anyone can check it and tell me what the problem is, it would be highly appreciated. ===============method 1====================
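
The three methods are cut off here, so the following does not diagnose them. It is only a hedged sketch of the usual shape of the task: generate the numbered URLs from the pattern, then fetch each one inside the loop. The names `page_urls` and `fetch_all` are made up for this illustration, and `fetch` is injectable so the loop can be exercised without network access.

```python
from urllib.request import urlopen

# URL pattern from the question; only the trailing number changes.
BASE = "https://www.example.com/S1-3-{page}.html"


def page_urls(first: int, last: int) -> list:
    """Build the URL of every page from `first` to `last`, inclusive."""
    return [BASE.format(page=n) for n in range(first, last + 1)]


def fetch_all(urls, fetch=None):
    """Fetch every URL in turn and return a {url: body} mapping."""
    if fetch is None:
        # Default fetcher: a plain blocking HTTP GET.
        fetch = lambda url: urlopen(url, timeout=10).read()
    return {url: fetch(url) for url in urls}


urls = page_urls(1, 5)
print(urls)
```

Calling `fetch_all(urls)` would then request all five pages; a fetch that sits outside the loop, or that reuses the first URL, is one common way to end up with only the first page.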

VBA scrape src instead of href

半世苍凉 submitted on 2019-12-11 17:03:21
Question: I am using the code below, but for some reason it returns the value of 'src' instead of 'href'. Can anyone help, please? Sub bringfox(txt As String) Dim oHtml As HTMLDocument Dim oElement As Object Set oHtml = New HTMLDocument maintext2 = "https://www.jjfox.co.uk/cigars/show/all.html" With CreateObject("WINHTTP.WinHTTPRequest.5.1") .Open "GET", maintext2 & gr, False .send oHtml.body.innerHTML = .responseText End With counter = cnt 'oElement(i).Children(0).getAttribute ("href") Set oElement

How to scrape data using Ruby which is generated by a Javascript function?

人盡茶涼 submitted on 2019-12-11 16:31:57
Question: I am trying to scrape the data URL link for the latest date (the first row of the table) from this page, but the content of the table seems to be generated by a JavaScript function. I tried using Nokogiri to get it, but in vain, as Nokogiri cannot execute JavaScript. Then I tried to get only the script part with Nokogiri, using: url = "http://www.sgx.com/wps/portal/sgxweb/home/marketinfo/historical_data/derivatives/daily_data" doc = Nokogiri::HTML(open(url)) js = doc.css("script").text

Scrape a particular area of site content With a Secure Login

这一生的挚爱 submitted on 2019-12-11 16:24:25
Question: I am trying to scrape some particular text from a website that requires a login. Here is a tutorial on doing this with cURL: http://www.digeratimarketing.co.uk/2008/12/16/curl-page-scraping-script/ But I am unable to apply it to my own cURL code. Here is my script: $url = "http://aftabcurrency.com/login_script.php"; $ch = curl_init(); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_URL, $url); $cookie = 'cookies.txt'; $timeout = 30; curl_setopt ($ch, CURLOPT

How to scrape all possible results from a search bar of a website

ぃ、小莉子 submitted on 2019-12-11 15:29:48
Question: This is my first web scraping task. I have been tasked with scraping this website. It is a site that contains the names of lawyers in Denmark. My difficulty is that I can only retrieve names based on the particular name query I put in the search bar. Is there an online tool I can use to scrape all the names that the website contains? I have used tools like Import.io, with no success so far. I am very confused about how all of this works. Answer 1: Please scroll down to UPDATE 2 The website

ElementNotVisibleException: Message: element not visible - Python3 Selenium

廉价感情. submitted on 2019-12-11 15:16:53
Question: I have been tasked with writing a parser that clicks an href link (styled to look like a button) on a website, and I am having some issues. Here's the HTML: https://pastebin.com/HDKLXpdJ Here's the source HTML: https://pastebin.com/PgT91kJs Python code: browser = webdriver.Chrome() ... try: element = WebDriverWait(browser, 20).until( EC.presence_of_element_located((By.ID, "reply-panel-reveal-btn"))) finally: elem = browser.find_element_by_xpath("//A[@id='reply-panel-reveal-btn']").click() I am getting