Scrapy or Selenium or Mechanize to scrape web data?

Submitted by 天大地大妈咪最大 on 2019-12-04 17:46:11

Question


I want to scrape some data from a website.

The website shows a table with around 50 records. To see more, the user has to click a button, which makes an AJAX call to fetch and show the next 50 records.

I already know Selenium WebDriver (Python) and could do this very quickly with it. However, Selenium is primarily a test automation tool, and it is very slow.

After some research, I found that I could do the same thing with Scrapy or Mechanize.

Should I use Scrapy, Mechanize, or Selenium for this?


Answer 1:


I would recommend a combination of Mechanize and ExecJS (https://github.com/sstephenson/execjs) to execute any JavaScript you might come across. I have used those two gems together for quite some time now, and they do a great job.

You should choose this over Selenium because it will be a lot faster than rendering the entire page in a headless browser.
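
To illustrate why skipping the browser is faster: if the "next 50 records" button only triggers an HTTP request, a script can issue that request itself and never render the page. The following is a minimal sketch of that idea, written in Python with only the standard library rather than the Mechanize and ExecJS gems. The endpoint URL, the `offset` and `limit` parameter names, and the JSON-array response format are all assumptions made for illustration; the real values would have to be read from the browser's network tab on the actual site.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint and page size; replace with what the browser's
# network tab shows when the "next" button is clicked.
BASE_URL = "https://example.com/records"
PAGE_SIZE = 50


def fetch_page(offset, limit=PAGE_SIZE):
    """Fetch one page of records (assumes the response is a JSON array)."""
    query = urlencode({"offset": offset, "limit": limit})  # assumed parameter names
    with urlopen(f"{BASE_URL}?{query}", timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def iter_records(fetch=fetch_page, page_size=PAGE_SIZE, max_pages=1000):
    """Yield records page by page until a short or empty page is returned."""
    for page in range(max_pages):
        rows = fetch(page * page_size, page_size)
        yield from rows
        if len(rows) < page_size:
            # A page with fewer rows than requested is treated as the last one.
            break
```

Looping over `iter_records()` would then walk through all records, 50 at a time. The `fetch` argument exists so that the paging logic can be tried against a fake data source before pointing it at a real site.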




Answer 2:


I'd definitely choose Scrapy. If the page relies on JavaScript that Scrapy can't handle on its own, you can try Scrapy + Splash. Scrapy is by far the fastest web scraping tool that I'm aware of. Good luck!



Source: https://stackoverflow.com/questions/20939401/scrapy-or-selenium-or-mechanize-to-scrape-web-data
