Question
I'm trying to use Scrapy to build a web scraper, but I'm running into many problems because it uses Python 2. Is it possible to run the 2to3 command on all the files in the tarball at once? Would that cause unforeseen errors? Is there an alternative web scraping framework that is more up to date and more functional that you would recommend instead?
I ask because there doesn't seem to be much recent activity on forums about the problems inherent in running Scrapy 0.24, i.e. the fact that it's written in Python 2.
If Scrapy is the best choice and porting is a bad idea, what's the best way to run it on my Python 3 oriented machine? Is there a command to run it only with Python 2, or something I can change in a config file?
UPDATE
If you run into the same problems, simply run the setup.py script with Python 2, i.e.,
python2 setup.py install
After that, Scrapy will work.
(as indicated by @alecxe)
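If you are not sure what the Python 2 interpreter is called on your machine, something like the following could find it first. This is only a rough sketch: `find_python2` and `build_install_command` are names I made up, and the candidate names `python2` and `python2.7` are an assumption about how the interpreter is installed on your system.

```python
import shutil


def find_python2(candidates=("python2", "python2.7")):
    """Return the path of the first candidate interpreter found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def build_install_command(interpreter):
    """Build the argument list for running setup.py with the given interpreter."""
    return [interpreter, "setup.py", "install"]


interpreter = find_python2()
if interpreter is None:
    print("No Python 2 interpreter found on PATH")
else:
    # This list could be passed to subprocess.run(..., check=True)
    # from inside the unpacked Scrapy source directory.
    print(build_install_command(interpreter))
```

The script only prints the command rather than running it, so it is safe to try from any directory.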
Answer 1:
The problem with porting Scrapy to Python 3 is that Scrapy is built on top of the Twisted event-driven framework, which currently has not been fully ported to Python 3.
There is no web-scraping framework on Python 3 as big and mature as Scrapy. pyspider looks promising, though it is a bit different; see:
- Can Scrapy be replaced by pyspider?
Also, there are other libraries related to web-scraping and html-parsing that support Python 3:
- beautifulsoup4
- lxml
- requests
- MechanicalSoup (built on top of requests and BeautifulSoup)
- selenium
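To give a feel for small-scale scraping on Python 3 without Scrapy, here is a minimal sketch that collects links from a page. Note that it deliberately uses only the standard library (`html.parser` and `urllib.parse`) instead of the libraries listed above, so it runs without installing anything; the libraries above give you a more convenient API for the same job. `LinkCollector`, `extract_links`, and `SAMPLE_HTML` are made-up names, and the HTML is an inline sample rather than a fetched page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse an HTML string and return the absolute links it contains."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page, used here instead of a real HTTP request.
SAMPLE_HTML = """
<html><body>
  <a href="/questions">Questions</a>
  <a href="https://example.com/tags">Tags</a>
  <a name="no-href">Anchor without a link</a>
</body></html>
"""

links = extract_links(SAMPLE_HTML, "https://example.com/")
print(links)
```

In a real scraper the HTML string would come from an HTTP response, and you would add the crawling, throttling, and error handling that Scrapy otherwise provides.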
Source: https://stackoverflow.com/questions/28390386/port-web-scraper-scrapy-0-24-to-python-3-or-scrap-scrapy-for-something-better