How to web crawl some sites [closed]

时光总嘲笑我的痴心妄想 posted on 2019-12-13 11:18:39

Question


I am starting a new project of crawling websites to retrieve and store data internally using a web service. I looked up some information and came across Scrapy and Beevolve web crawling services.

My question is: is it better to build my own crawler, given that I have no prior experience, or to rent a web crawling service?

One issue I have run into is that some of the websites require a login before they serve any data.


Answer 1:


If you want to create your own web crawler in Java, you may want to look at this

You could also take a look at jSpider and jsoup.

Edit: This could work too: crawler4j
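
To give a rough idea of what building your own crawler in Java involves, here is a minimal sketch of the core loop: fetch a page, extract its links, and queue the ones on the same host. It uses only the standard library, so the regex-based link extraction is a deliberate simplification; a parser such as jsoup would be more robust. The class name `MiniCrawler`, the `fetcher` parameter, and the `example.test` pages are all made up for illustration, and the in-memory map stands in for real HTTP requests. It does not handle logins, robots.txt, or rate limiting.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniCrawler {
    // Naive matcher for double-quoted href values; hrefs containing '#' are skipped.
    // A real HTML parser (for example jsoup) handles far more cases.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"#]+)\"", Pattern.CASE_INSENSITIVE);

    /** Returns the link targets found in html, resolved against baseUrl. */
    static List<String> extractLinks(String baseUrl, String html) {
        List<String> links = new ArrayList<>();
        URI base = URI.create(baseUrl);
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                links.add(base.resolve(m.group(1).trim()).toString());
            } catch (IllegalArgumentException e) {
                // Skip href values that are not valid URIs.
            }
        }
        return links;
    }

    /**
     * Breadth-first crawl starting at seed, restricted to the seed's host.
     * fetcher maps a URL to its HTML (or null if the page is unavailable).
     * Returns the URLs that were attempted, in visiting order.
     */
    static Set<String> crawl(String seed, Function<String, String> fetcher, int maxPages) {
        String host = URI.create(seed).getHost();
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(seed);
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.poll();
            if (!visited.add(url)) {
                continue; // already seen
            }
            String html = fetcher.apply(url);
            if (html == null) {
                continue; // nothing to parse
            }
            for (String link : extractLinks(url, html)) {
                String linkHost = URI.create(link).getHost();
                if (host != null && host.equals(linkHost) && !visited.contains(link)) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical three-page site held in memory instead of fetched over HTTP.
        Map<String, String> site = Map.of(
                "http://example.test/", "<a href=\"/a\">A</a> <a href=\"http://other.test/x\">X</a>",
                "http://example.test/a", "<a href=\"b\">B</a> <a href=\"/\">home</a>",
                "http://example.test/b", "no links here");
        System.out.println(crawl("http://example.test/", site::get, 10));
    }
}
```

Passing the fetcher in as a function keeps the crawl logic testable offline; for a real site you would replace `site::get` with a function that performs the HTTP request (and, for sites that require a login, sends the session cookie).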



Source: https://stackoverflow.com/questions/23917790/how-to-web-crawl-some-sites
