Python Scrapy: how to set the parameter in code instead of using cmd (custom code in Scrapy)

亡梦爱人 submitted on 2020-01-25 04:51:11

Question


I am using Scrapy 0.20 with Python 2.7.

I used to do this in cmd:

 -s JOBDIR=crawls/somespider-1

to handle the duplicated items. Please note that I have already made the changes in the settings.

I don't want to use that in cmd.

Is there any way I can type it in code inside my spider?

Thanks


Answer 1:


It's easy. Raise DropItem in pipelines.py to drop a duplicated item, and use a custom command to set the parameter inside the program instead of typing it in cmd.
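As a rough illustration of the DropItem suggestion, a pipeline along these lines could discard items it has already seen. The class name DuplicatesPipeline and the 'id' field are assumptions made for this sketch and do not come from the original answer; the ImportError fallback is only there so the snippet also runs without Scrapy installed.

```python
try:
    from scrapy.exceptions import DropItem
except ImportError:
    # Fallback so this sketch runs without Scrapy installed
    # (a stand-in for illustration, not Scrapy's own class).
    class DropItem(Exception):
        pass


class DuplicatesPipeline(object):
    """Hypothetical pipeline: drops any item whose 'id' was already seen."""

    def __init__(self):
        # Identifiers of the items that have passed through so far.
        self.ids_seen = set()

    def process_item(self, item, spider):
        # 'id' is an assumed field name; use whatever uniquely identifies your items.
        if item['id'] in self.ids_seen:
            raise DropItem("Duplicate item found: %r" % (item['id'],))
        self.ids_seen.add(item['id'])
        return item
```

A pipeline like this only takes effect once it is listed in the project's ITEM_PIPELINES setting. It remembers identifiers only for the duration of one run, so it complements JOBDIR rather than replacing it.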

Here is an example of custom code in Scrapy.

Using a custom command (say: scrapy mycommand), you can run -s JOBDIR=crawls/somespider-1 without typing it in cmd.

Example:

Create a directory named commands in the folder where your scrapy.cfg file is. Inside that directory, create a file mycommand.py:

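# Import path as used with Scrapy 0.20; newer releases expose this class as scrapy.commands.ScrapyCommand
from scrapy.command import ScrapyCommand
from scrapy.cmdline import execute


class Command(ScrapyCommand):
    requires_project = True

    def short_desc(self):
        return "This is your custom command"

    def run(self, args, opts):
        # Build the argument list exactly as it would be typed in cmd:
        #   scrapy crawl spider -s JOBDIR=crawls/somespider-1
        # Replace 'spider' with your spider's name and add whatever else your syntax needs.
        argv = ['scrapy', 'crawl', 'spider', '-s', 'JOBDIR=crawls/somespider-1']
        # execute() takes a list with one element per command-line token
        execute(argv)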

Now go to the command line and type scrapy mycommand. Then your magic is ready :-)
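One step the example leaves implicit is that Scrapy looks up custom commands through the COMMANDS_MODULE setting, so scrapy mycommand is only recognised once that setting points at the commands package. The fragment below is a sketch under two assumptions: the commands directory sits next to scrapy.cfg, and it contains an empty __init__.py so that it can be imported. If the directory lives inside the project package instead, the value would be the dotted path to it.

```python
# settings.py (fragment)

# Assumption: the commands directory sits next to scrapy.cfg and contains an
# empty __init__.py, so it is importable as the top-level package "commands".
COMMANDS_MODULE = "commands"

# Optional alternative (assumption): setting JOBDIR here should have the same
# effect as passing -s JOBDIR=... in cmd, but it applies to every crawl in
# the project rather than to a single spider.
# JOBDIR = "crawls/somespider-1"
```

The commented-out JOBDIR line is the simplest way to avoid the cmd option entirely, at the cost of sharing one job directory across all spiders, which is usually not what you want when a project has more than one.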



Source: https://stackoverflow.com/questions/22131980/python-scrapy-how-to-code-the-parameter-instead-of-using-cmd-use-custom-code-in
