How to log Scrapy spiders running from a script


Question


Hi all, I have multiple spiders running from a script. The script is scheduled to run once daily.

  1. I want to log info and error messages separately. The log filenames must be spider_infolog_[date] and spider_errlog_[date]. I am trying the following code:

In the spider's __init__.py file:

from twisted.python import log
import logging
LOG_FILE = 'logs/spider.log'
ERR_FILE = 'logs/spider_error.log'
logging.basicConfig(level=logging.INFO, filemode='w+', filename=LOG_FILE)
logging.basicConfig(level=logging.ERROR, filemode='w+', filename=ERR_FILE)
observer = log.PythonLoggingObserver()
observer.start()

Within the spider:

import logging
.
.
.
logging.error(message)
  2. If any exception happens in the spider code [for example, I am fetching the start URLs from MySQL; if the connection fails I need to close that specific spider, not the other spiders, because I am running all spiders from one script], I raise:

    raise CloseSpider(message)

Is the above code sufficient to close that particular spider? (See the sketch below.)
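For what it is worth, a minimal sketch of that pattern, assuming the start URLs come from MySQL via MySQLdb (the connection details, table name and spider name below are illustrative, not from the original code). CloseSpider stops only the spider that raises it, so other spiders started from the same script keep running:

import MySQLdb
import scrapy
from scrapy.exceptions import CloseSpider

class ExampleSpider(scrapy.Spider):
    name = 'example_spider'

    def start_requests(self):
        try:
            conn = MySQLdb.connect(host='localhost', user='user',
                                   passwd='secret', db='crawler')
        except MySQLdb.OperationalError as exc:
            # Abort this spider only; any other running spiders are unaffected.
            raise CloseSpider('MySQL connection failed: %s' % exc)
        cursor = conn.cursor()
        cursor.execute('SELECT url FROM start_urls')
        for (url,) in cursor.fetchall():
            yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        self.logger.info('Parsed %s', response.url)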

EDIT @eLRuLL

import logging
from scrapy.utils.log import configure_logging
LOG_FILE = 'logs/spider.log'
ERR_FILE = 'logs/spider_error.log'
configure_logging()
logging.basicConfig(level=logging.INFO, filemode='w+', filename=LOG_FILE)
logging.basicConfig(level=logging.ERROR, filemode='w+', filename=ERR_FILE)

I have put the above code in the script that schedules the spiders. It is not working: the log files are not created, but I do get log messages in the console.

EDIT 2

I have added install_root_handler=False to configure_logging(). Now all of the console output goes into the spider.log file, but errors are not separated out.

configure_logging(install_root_handler=False)
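For reference, a minimal sketch of one way to get the two separate files, assuming it runs once in the scheduling script before the crawls start. Note that logging.basicConfig only configures the root logger on its first call, which is why the second call in the code above has no effect; attaching the handlers yourself avoids that. The formatter and the level filter below are illustrative choices:

import logging
from scrapy.utils.log import configure_logging

LOG_FILE = 'logs/spider.log'
ERR_FILE = 'logs/spider_error.log'

class MaxLevelFilter(logging.Filter):
    # Let through only records below the given level.
    def __init__(self, max_level):
        logging.Filter.__init__(self)
        self.max_level = max_level

    def filter(self, record):
        return record.levelno < self.max_level

# Stop Scrapy from installing its own root handler, then attach two handlers:
# one for INFO records, one for ERROR and above.
configure_logging(install_root_handler=False)

formatter = logging.Formatter('%(asctime)s [%(name)s] %(levelname)s: %(message)s')

info_handler = logging.FileHandler(LOG_FILE, mode='w')
info_handler.setLevel(logging.INFO)
info_handler.addFilter(MaxLevelFilter(logging.ERROR))  # keep errors out of the info file
info_handler.setFormatter(formatter)

error_handler = logging.FileHandler(ERR_FILE, mode='w')
error_handler.setLevel(logging.ERROR)
error_handler.setFormatter(formatter)

root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(info_handler)
root.addHandler(error_handler)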

Answer 1:


You can do this:

from scrapy import cmdline

cmdline.execute("scrapy crawl myspider --logfile mylog.log".split())

Put that script in the same directory as your scrapy.cfg file.
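A usage sketch of how that could fit the daily scheduling script from the question. scrapy.cmdline.execute() calls sys.exit() once the crawl finishes, so one option for running several spiders with dated log files is to launch a separate process per spider; the spider names here are illustrative:

import datetime
import subprocess

today = datetime.date.today().isoformat()
for spider_name in ['spider_one', 'spider_two']:
    # --logfile is a standard Scrapy global option, so each spider
    # gets its own dated log file.
    subprocess.call([
        'scrapy', 'crawl', spider_name,
        '--logfile', 'logs/{}_infolog_{}.log'.format(spider_name, today),
    ])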



Source: https://stackoverflow.com/questions/33825930/how-to-log-scrapy-spiders-running-from-script
