Question
I'm trying to load the contents of this page into a PHP variable:
https://www.warcraftlogs.com/rankings/server/393/latest/#class=Druid&spec=Feral
As you can see, the page only starts loading its real content after the initial page load has finished.
file_get_contents("https://www.warcraftlogs.com/rankings/server/393/latest/#class=Druid&spec=Feral");
returns only the empty page skeleton, without the table contents that are loaded in the second step.
Is there a way to make file_get_contents() wait for the page to finish loading?
Answer 1:
In order to understand what's happening on the site, try opening your browser's network inspector. You'll see the page itself load, and then you'll see various other resources load, like CSS files, JS files, images, and some more pages.
One of those other pages is this: https://www.warcraftlogs.com/rankings/table/dps/6/0/5/20/1/Druid/Feral/0/393/?search=&page=1.
It looks like the main site issues an AJAX request to fetch the additional data from that URL. Note that there's no way for file_get_contents() to get everything all at once, since file_get_contents() will not parse the website or evaluate any JS (and JS is what triggers the AJAX request). The solution is simple - instead of using file_get_contents() to grab the main site, use it to grab that secondary page with the data.
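As a minimal sketch of that approach, the helper below wraps file_get_contents() so that a failed request yields null instead of a warning. The function name fetchBody, the timeout, and the User-Agent string are illustrative assumptions, not anything the site requires or documents.

```php
<?php
// Hypothetical helper: return the raw body of $url, or null if the read fails.
function fetchBody(string $url, int $timeoutSeconds = 10): ?string
{
    // Assumption: a short timeout and a placeholder User-Agent header.
    // These options only apply to http(s) URLs and are ignored otherwise.
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'timeout' => $timeoutSeconds,
            'header'  => "User-Agent: example-fetcher/1.0\r\n",
        ],
    ]);

    // @ suppresses the warning so failure is signalled only by the return value.
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : $body;
}
```

Calling fetchBody() with the secondary URL above should return the HTML for the table, provided the site still serves that endpoint and allow_url_fopen is enabled; no JavaScript is executed at any point.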
If you need to work out how this URL is built (for example, to request a different class, spec, or server), you'll have to dig deeper. If you open the main page, you'll find a piece of JS embedded on the page that looks like this:
function loadTable()
{
    var loadString = '/rankings/table/' + filterMetric + '/' + zoneID + '/' + filterBoss + '/' + filterDifficulty + '/' + filterSize + '/' + filterRegion + '/' + filterClass + '/' + filterSpec + '/' + filterBracket + '/' + filterServer + '/' + '?' + "search=" + filterSearch + "&page=" + filterPage;
    $("#table-container").load(loadString, tableLoaded);
}
Notice how it dynamically builds a string from the desired parameters. It then calls jQuery's $.fn.load(), which sends the AJAX request to that URL and inserts the response into #table-container.
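The same concatenation can be reproduced in PHP. This is only a sketch: buildRankingsTableUrl and the array keys are hypothetical names, the segment order is copied from loadString, and the example values are read off the URL shown earlier.

```php
<?php
// Hypothetical helper mirroring the string concatenation in the page's loadTable().
function buildRankingsTableUrl(array $filters): string
{
    // Path segments in the same order as loadString in the page's JS.
    $order = ['metric', 'zone', 'boss', 'difficulty', 'size',
              'region', 'class', 'spec', 'bracket', 'server'];

    $segments = [];
    foreach ($order as $key) {
        $segments[] = rawurlencode((string) $filters[$key]);
    }

    // search and page are optional here; these defaults are an assumption.
    $query = 'search=' . rawurlencode($filters['search'] ?? '')
           . '&page=' . (int) ($filters['page'] ?? 1);

    return 'https://www.warcraftlogs.com/rankings/table/'
         . implode('/', $segments) . '/?' . $query;
}

// Values taken from the example URL for Feral Druids on server 393.
$url = buildRankingsTableUrl([
    'metric' => 'dps', 'zone' => 6, 'boss' => 0, 'difficulty' => 5,
    'size' => 20, 'region' => 1, 'class' => 'Druid', 'spec' => 'Feral',
    'bracket' => 0, 'server' => 393,
]);
```

The resulting $url can then be passed to file_get_contents(). What each numeric value means (zone, difficulty, and so on) is not stated on the page, so changing them is guesswork unless you compare against the requests shown in the network inspector.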
Answer 2:
The data you want is served from a different URL:
https://www.warcraftlogs.com/rankings/table/dps/6/0/5/20/1/Druid/Feral/0/393/?search=&page=1
Answer 3:
That site uses AJAX. You can find the AJAX request and fetch it directly. The actual data URL is:
file_get_contents("https://www.warcraftlogs.com/rankings/table/dps/6/0/5/20/1/Druid/Feral/0/393/?search=&page=1");
Answer 4:
You can load the data from this URL:
https://www.warcraftlogs.com/rankings/table/dps/6/0/5/20/1/Druid/Feral/0/393/?search=&page=1
Source: https://stackoverflow.com/questions/28130130/make-file-get-contents-wait-for-website-to-load-completely