Screen scraping technique using PHP

你的背包 2020-12-20 07:10

How can I screen scrape a particular website? I need to log in to the site and then scrape the information behind the login. How could this be done?

Please guide me.

6 answers
  • 2020-12-20 07:19

    You could also check out http://php.net/dom
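
    As a rough sketch of what the DOM extension offers, the snippet below parses an HTML string with DOMDocument and pulls out links with DOMXPath. The markup, the "account" class name and the extract_links() helper are made up for illustration; the real page will need its own XPath expression.

```php
<?php
// Hypothetical markup standing in for a page fetched after logging in.
$html = '<html><body><div class="account">'
      . '<a href="/orders/1">Order 1</a>'
      . '<a href="/orders/2">Order 2</a>'
      . '</div></body></html>';

// Hypothetical helper: returns [href => link text] for links inside div.account.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; buffer libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//div[@class="account"]//a') as $a) {
        $links[$a->getAttribute('href')] = trim($a->textContent);
    }
    return $links;
}

print_r(extract_links($html));
```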

  • 2020-12-20 07:22
    Zend_Http_Client and Zend_Dom_Query
    
  • 2020-12-20 07:22

    Use cURL to log in, and once you're in, use the QueryPath PHP library (querypath.org). You can access DOM elements just like in jQuery, via CSS selectors, and it supports method chaining.

    Way better than just using PHP's native XML functions.

    It also works as a Drupal extension, but I suppose you could use it in any PHP project.

  • 2020-12-20 07:23

    You want to look at the cURL functions - they will let you get a page from another website. You can use cookies or HTTP authentication to log in first, then get the page you want, depending on the site you're logging in to.
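
    A minimal sketch of the cookie-based variant, assuming the site uses an ordinary HTML login form. The function names, URLs and form field names here are hypothetical; check the real login form for the actual field names. The same cURL handle is reused for both requests so the session cookie set at login is sent along with the second one.

```php
<?php
// Hypothetical helper: cURL options for the login POST.
function login_options(string $loginUrl, array $formFields): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($formFields),
        CURLOPT_RETURNTRANSFER => true, // return the response body as a string
        CURLOPT_FOLLOWLOCATION => true, // follow the redirect many sites send after login
        CURLOPT_COOKIEFILE     => '',   // empty string: keep cookies in memory on this handle
    ];
}

// Hypothetical helper: execute the request and fail loudly on transport errors.
function run_request($ch): string
{
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

// Log in, then fetch a page that is only visible to logged-in users.
function scrape_behind_login(string $loginUrl, array $formFields, string $targetUrl): string
{
    $ch = curl_init();
    curl_setopt_array($ch, login_options($loginUrl, $formFields));
    run_request($ch); // the session cookie is now held by the handle

    // Switch the same handle back to GET and request the protected page.
    curl_setopt_array($ch, [CURLOPT_HTTPGET => true, CURLOPT_URL => $targetUrl]);
    return run_request($ch);
}

// Example call (hypothetical URLs and field names):
// $html = scrape_behind_login(
//     'https://example.com/login',
//     ['username' => 'alice', 'password' => 'secret'],
//     'https://example.com/account'
// );
```

    For a site that uses HTTP authentication instead of a form, the CURLOPT_USERPWD option (a "user:password" string) would take the place of the login POST.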

    Once you have the page, you're probably best off using regular expressions to scrape the data you want.
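
    For example, a small sketch using preg_match_all on a made-up table of prices. The markup, the "price" class and the extract_prices() helper are assumptions for illustration; a pattern like this only holds up while the target markup stays this regular.

```php
<?php
// Hypothetical markup standing in for the fetched page.
$page = '<table>'
      . '<tr><td class="price">19.99</td></tr>'
      . '<tr><td class="price"> 5.00 </td></tr>'
      . '</table>';

// Hypothetical helper: returns every price found in <td class="price"> cells, as strings.
function extract_prices(string $page): array
{
    // Capture group 1 holds the number; \s* tolerates whitespace inside the cell.
    $pattern = '~<td class="price">\s*([0-9]+(?:\.[0-9]+)?)\s*</td>~';
    preg_match_all($pattern, $page, $matches);
    return $matches[1];
}

print_r(extract_prices($page));
```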

  • 2020-12-20 07:35

    You might also want to take a look at BeautifulSoup, a Python library that is supposed to be very good at making bad HTML parseable. It is aimed at things like screen scraping.

    I don't know how easy it would be to call from PHP, though.

  • 2020-12-20 07:37

    You should look at curl.
