Question
I am trying to scrape a website (I believe its content is generated with JavaScript) using a simple PHP script. I am a beginner, so any help would be greatly appreciated. The URL of the webpage is:
http://www.indiainfoline.com/Markets/Company/Fundamentals/Balance-Sheet/Yes-Bank-Ltd/532648
For example, I would like to pass the company name (Yes-Bank-Ltd) and code (532648) to file_get_contents. I am not sure how to do this, so can somebody please help?
Thanks, Nidhi
Answer 1:
Why not just append the company name and code to the URL? One idea: fill one array with company names and another with codes (they need to be the same size), then loop over them to scrape the data you want.
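// Parallel arrays: same length, same order (sample values taken from the question).
$listOfCie = array("Yes-Bank-Ltd");
$listOfCode = array("532648");
for ($i = 0; $i < count($listOfCie); $i++)
{
    $cie = $listOfCie[$i];
    $code = $listOfCode[$i];
    // Build the page URL for this company.
    $urlToScrape = "http://www.indiainfoline.com/Markets/Company/Fundamentals/Balance-Sheet/" . $cie . "/" . $code;
    // Note: the PHP function is file_get_contents, not get_file_contents.
    //... = file_get_contents($urlToScrape....
}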
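The same idea can be sketched with a single associative array instead of two parallel arrays, which avoids the lists getting out of sync. This is only a sketch: the helper names `buildBalanceSheetUrl` and `fetchPage` are made up for illustration, the 10-second timeout is an arbitrary choice, and only the Yes-Bank-Ltd / 532648 pair comes from the question. Fetching over HTTP with `file_get_contents` also assumes `allow_url_fopen` is enabled in your PHP configuration.

```php
<?php
// Hypothetical helper: builds the balance-sheet URL for one company.
function buildBalanceSheetUrl(string $company, string $code): string
{
    $base = "http://www.indiainfoline.com/Markets/Company/Fundamentals/Balance-Sheet/";
    // rawurlencode() leaves letters, digits and hyphens alone and escapes the rest.
    return $base . rawurlencode($company) . "/" . rawurlencode($code);
}

// Hypothetical helper: downloads a page, returning null if the request fails.
function fetchPage(string $url): ?string
{
    // Assumed timeout of 10 seconds; adjust as needed.
    $context = stream_context_create(["http" => ["timeout" => 10]]);
    $html = @file_get_contents($url, false, $context);
    return $html === false ? null : $html;
}

// Company slug => code. Only this pair comes from the question.
$companies = ["Yes-Bank-Ltd" => "532648"];

foreach ($companies as $company => $code) {
    $url = buildBalanceSheetUrl($company, (string) $code);
    echo $url, PHP_EOL;
    // $html = fetchPage($url); // uncomment to actually download the page
}
```

Keeping the URL construction in its own function makes it easy to check the generated URLs before any request is sent.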
Answer 2:
Use the data.html table in YQL! http://developer.yahoo.com/yql/console
Answer 3:
The simplest way to scrape a site in PHP is to use cURL
(http://php.net/manual/en/book.curl.php)
For some examples look at http://php.net/manual/en/curl.examples-basic.php or google :)
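As a rough illustration of the cURL approach, a minimal fetch helper might look like the following. The function name `fetchWithCurl`, the user-agent string, and the default timeout are assumptions for this sketch, not anything the site requires. It needs the PHP curl extension to be installed.

```php
<?php
// Hypothetical helper: fetches a URL with cURL and returns the response body,
// or null if the transfer fails or the server answers with an error status.
function fetchWithCurl(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow HTTP redirects
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => "Mozilla/5.0 (compatible; example-scraper)", // assumed value
    ]);

    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.

    if ($body === false || $status >= 400) {
        return null;
    }
    return $body;
}
```

Calling `fetchWithCurl("http://www.indiainfoline.com/Markets/Company/Fundamentals/Balance-Sheet/Yes-Bank-Ltd/532648")` would return the raw HTML of the page (or null on failure), which you can then parse, for example with DOMDocument.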
If the website relies on JavaScript, though, it's going to be difficult to get the data you want, because cURL only downloads the raw HTML and does not execute scripts. You might look at a "headless browser" like http://phantomjs.org/
Source: https://stackoverflow.com/questions/6654541/scrape-a-website-javascript-website-using-php