Question
I'm trying to download some data from a web page that is dynamically generated, so wget doesn't work. The page is http://gaceta.diputados.gob.mx/SIL/Legislaturas/Listados.html. I want to download the list shown for each of the options that can be selected in the "Legislatura" field. Once it is downloaded, I can process the data in Ruby.
I'd like to know the best way to download this and, if possible, how to select each of the options and download its list.
Answer 1:
You can use the Web Inspector in Safari or Chrome, or the Firebug extension in Firefox, to look at how the data is loaded. The page makes an AJAX POST request to a Perl script on the site, and the data is returned as XML.
I would use cURL to send the same POST request and grab the data.
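Since you plan to process the data in Ruby anyway, the same POST can also be sent with Ruby's standard `net/http` library instead of cURL. This is only a rough sketch: the endpoint URL and the `legislatura` field name below are placeholders I made up, so replace them with whatever the network inspector shows for the real request.

```ruby
require "net/http"
require "uri"

# Hypothetical endpoint: replace it with the script URL that the
# browser's network inspector shows for the real AJAX POST.
ENDPOINT = URI("http://example.invalid/cgi-bin/listado.pl")

# Build the POST request for one "Legislatura" option.
# The form field name "legislatura" is an assumption, not the site's real name.
def build_request(endpoint, legislatura)
  request = Net::HTTP::Post.new(endpoint)
  request.set_form_data("legislatura" => legislatura)
  request
end

# Send the request and return the raw XML body as a string.
def fetch_listing(endpoint, legislatura)
  response = Net::HTTP.start(endpoint.host, endpoint.port) do |http|
    http.request(build_request(endpoint, legislatura))
  end
  response.value # raises an error unless the response status is 2xx
  response.body
end
```

Calling `fetch_listing(ENDPOINT, value)` once per option value and writing each result to its own file would give you one XML document per "Legislatura" to process afterwards.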
Answer 2:
You could use Watir (http://watir.com/) or Webrat to simulate the steps you would take in a browser to view the data, then use Nokogiri to parse the resulting HTML.
Source: https://stackoverflow.com/questions/5852220/how-to-download-dynamic-generated-content-from-webpage