How to download dynamically generated content from a webpage?

让人想犯罪 __ submitted on 2019-12-13 05:41:25

Question


I'm trying to download some data from a webpage that is dynamically generated, so wget doesn't work. The page is http://gaceta.diputados.gob.mx/SIL/Legislaturas/Listados.html. I want to download the list shown for each of the options that can be selected in the "Legislatura" field. Once it is downloaded, I can process the data in Ruby.

I just want to know the best way to download this and, if possible, how to select each of the options and download the resulting list.


Answer 1:


You can use the Web Inspector in Safari or Chrome, or the Firebug extension in Firefox, to look at how the data is loaded. The page makes an AJAX POST request to a Perl script on this website, and the data is returned as XML.

I would use cURL to grab the data.
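
Since the data will be processed in Ruby anyway, the same POST can be sketched with Ruby's standard Net::HTTP library instead of cURL. This is only a rough sketch: the endpoint URL, the form field name, and the option values below are placeholders (assumptions, not taken from the page), so replace them with whatever the browser's network inspector shows for the real request.

```ruby
require "net/http"
require "uri"

# Hypothetical endpoint and field name: replace both with the values the
# browser's network inspector shows for the real AJAX request.
ENDPOINT = "http://example.com/cgi-bin/listado.pl"
FIELD    = "legislatura"

# Build (but do not send) the POST request for one "Legislatura" option.
def build_request(endpoint, option)
  uri = URI.parse(endpoint)
  request = Net::HTTP::Post.new(uri)
  # Encodes the parameters as application/x-www-form-urlencoded.
  request.set_form_data(FIELD => option)
  [uri, request]
end

# Send the request and return the raw response body (expected to be XML).
def fetch_listing(endpoint, option)
  uri, request = build_request(endpoint, option)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  response.body
end

# Usage (performs network I/O, so it is left commented out).
# The option values here are made up for illustration.
# %w[57 58 59].each do |option|
#   File.write("legislatura_#{option}.xml", fetch_listing(ENDPOINT, option))
# end
```

Looping over the option values and saving one file per option covers the "select each of the options and download" part of the question, provided the script accepts the option as a plain form field.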




Answer 2:


You could use Watir (http://watir.com/) or Webrat to simulate the steps you would take in a browser to view the data, then use Nokogiri to parse the resulting HTML.



Source: https://stackoverflow.com/questions/5852220/how-to-download-dynamic-generated-content-from-webpage
