How do I package a Scrapy script into a standalone application?

Submitted by 雨燕双飞 on 2019-12-12 20:43:35

Question


I have a set of Scrapy spiders. They need to be run daily from a desktop application. What is the simplest way (from the user's point of view) to install and run them on another Windows machine?


Answer 1:


The simplest way is probably to write a Python script for them.

If you are running a Windows server, you can even schedule the command you use (scrapy crawl yourspider) to run the spiders.
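A minimal sketch of such a runner script, assuming the Scrapy project is on disk, `scrapy` is on the PATH, and the spider names below are placeholders:

```python
# Hypothetical "run all spiders" script; spider names are placeholders.
import subprocess

SPIDERS = ["spider_one", "spider_two"]

def build_command(spider, output="results.csv"):
    """Build the `scrapy crawl` command as an argument list.

    The list form avoids the shell-quoting problems a single
    os.system() string can run into.
    """
    return ["scrapy", "crawl", spider, "-o", output]

def run_all():
    for name in SPIDERS:
        # check=True raises CalledProcessError if a spider exits non-zero
        subprocess.run(build_command(name), check=True)
```

A script like this can then be pointed at by the Windows Task Scheduler to get the daily runs.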




Answer 2:


Create a script (e.g. run_spider.py) which runs scrapy crawl <spider_name> as a system command.

run_spider.py

from os import system

output_file_name = 'results.csv'
# Run the spider and export the scraped items to CSV (-t sets the feed format)
system('scrapy crawl myspider -o ' + output_file_name + ' -t csv')

Then feed that script to PyInstaller:

pyinstaller run_spider.py
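If you want a single-file executable rather than a folder of files, PyInstaller's standard --onefile option does that:

```shell
# Bundle the runner script into one self-contained executable.
# On Windows this produces dist/run_spider.exe.
pyinstaller --onefile run_spider.py
```

Note that Scrapy uses dynamic imports and data files that PyInstaller sometimes misses, so you may need to declare hidden imports in the generated .spec file before the bundled executable works.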



Answer 3:


Here is another possibility: run your spider as a standalone script or executable by driving Scrapy from Python with CrawlerProcess.

import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider(scrapy.Spider):
    # Your spider definition
    ...

process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})

process.crawl(MySpider)
process.start()  # the script will block here until the crawling is finished

You can find more information here: https://doc.scrapy.org/en/1.0/topics/practices.html



Source: https://stackoverflow.com/questions/18532596/how-do-i-package-a-scrapy-script-into-a-standalone-application
