Scraping dynamic content in a website

梦如初夏 2020-11-28 13:18

I need to scrape news announcements from this website, Link. The announcements seem to be generated dynamically: they don't appear in the page source. I usually use mechanize but

4 answers
  •  失恋的感觉
    2020-11-28 13:51

    In Python 2 you can use urllib and urllib2 (in Python 3, urllib.request) to connect to a website and collect data. For example:

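    try:
        from urllib.request import urlopen  # Python 3
    except ImportError:
        from urllib2 import urlopen  # Python 2
    myUrl = "http://www.marketvectorsindices.com/#!News/List"
    inStream = urlopen(myUrl)
    data = inStream.read(1024)  # reads up to 1024 bytes; repeat in a while loop until it returns empty
    # all your fun page parsing code (perhaps: from xml.dom.minidom import parse)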
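
The "while loop" and "page parsing" steps mentioned above could look roughly like the following Python 3 sketch. It is not the answerer's code: `read_all`, `LinkCollector`, and `extract_links` are hypothetical helper names, the sample HTML is made up, and an in-memory `io.BytesIO` stands in for the object returned by `urlopen` so the example runs without network access. Note that this approach only sees the HTML the server actually returns; anything JavaScript adds after the page loads (which seems to be the case for the `#!News/List` view in the question) will not be in the bytes read.

```python
import io
from html.parser import HTMLParser


def read_all(stream, chunk_size=1024):
    """Read a file-like object to the end in fixed-size chunks."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_bytes, encoding="utf-8"):
    """Decode the raw bytes and return all link targets found in them."""
    parser = LinkCollector()
    parser.feed(html_bytes.decode(encoding, errors="replace"))
    parser.close()
    return parser.links


# Stand-in for urlopen(myUrl): any object with a .read(n) method works here.
sample = io.BytesIO(
    b'<html><body><a href="/news/1">One</a><a href="/news/2">Two</a></body></html>'
)
page = read_all(sample, chunk_size=16)
links = extract_links(page)
```

With a real page, `read_all(urlopen(myUrl))` would replace the `io.BytesIO` stand-in. If the resulting HTML does not contain the announcements, the data is probably loaded by a separate request made from JavaScript, and that request would need to be fetched instead.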
    
