What is the most elegant way to do screen scraping in node.js?

再見小時候 2020-12-24 15:25

I'm in the process of hacking together a web app which uses extensive screen scraping in node.js. I feel like I'm fighting against the current at every corner. There must

3 Answers
  • 2020-12-24 15:28

    It turns out someone made a phantomjs module for node.js:

    https://github.com/sgentle/phantomjs-node

    While PhantomJS is fairly heavy, it supports SSL, cookies, and everything else a typical browser supports (it is a WebKit browser, after all).

    Give it a shot, it may be exactly what you are looking for.

  • 2020-12-24 15:32

    I actually have a scraper library now: https://github.com/mikeal/spider. It's quite nice; you can use jQuery and routes.

    Feedback is welcome :)

  • 2020-12-24 15:53

    You may want to check out https://github.com/mikeal/request from mikeal. I just spoke to him in the chatroom, and he says that it does not handle cookies at the moment, but you can write a submodule to handle these in the meantime.

    As for redirects, it handles them beautifully :)
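
    To give a rough idea of what such a cookie-handling submodule might look like, here is a minimal sketch in plain Node.js. The `SimpleCookieJar` class and its `store` and `header` methods are hypothetical names made up for illustration; they are not part of mikeal/request. The sketch assumes you only need to echo cookies back to the same site, so it ignores Domain, Path, Expires and Secure entirely.

```javascript
// Minimal in-memory cookie jar (illustrative only; not part of mikeal/request).
// Assumption: cookie attributes (Domain, Path, Expires, Secure) are ignored.
class SimpleCookieJar {
  constructor() {
    this.cookies = new Map(); // cookie name -> cookie value
  }

  // Accepts the `set-cookie` value from a Node response's headers,
  // which is an array of strings (or undefined when no cookie was set).
  store(setCookieHeaders) {
    for (const header of setCookieHeaders || []) {
      const pair = header.split(';')[0]; // keep "name=value", drop attributes
      const eq = pair.indexOf('=');
      if (eq < 1) continue; // skip entries with no "=" or an empty name
      const name = pair.slice(0, eq).trim();
      const value = pair.slice(eq + 1).trim();
      this.cookies.set(name, value); // a later cookie with the same name wins
    }
  }

  // Builds the value for an outgoing `Cookie` request header.
  header() {
    return Array.from(this.cookies, ([name, value]) => `${name}=${value}`).join('; ');
  }
}

const jar = new SimpleCookieJar();
jar.store(['session=abc123; Path=/; HttpOnly', 'theme=dark']);
console.log(jar.header()); // session=abc123; theme=dark
```

    You would call `store(response.headers['set-cookie'])` after each response and send `header()` back as the `Cookie` request header on the next request. A real scraper would also need to respect the domain, path and expiry rules that this sketch skips.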
