How to give a URL to Scrapy for crawling?

终归单人心 2020-11-29 01:42

I want to use scrapy for crawling web pages. Is there a way to pass the start URL from the terminal itself?

It is given in the documentation that either the name of

6 Answers
  •  爱一瞬间的悲伤
    2020-11-29 02:18

    I'm not sure whether there is a dedicated command-line option for this. However, you could write your spider so that it takes the start URL as an argument:

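    import scrapy

    class MySpider(scrapy.Spider):  # BaseSpider is the older name for scrapy.Spider

        name = 'my_spider'

        def __init__(self, *args, **kwargs):
            super(MySpider, self).__init__(*args, **kwargs)

            # Each -a key=value pair on the command line arrives here as a keyword argument.
            self.start_urls = [kwargs.get('start_url')]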
    

    And start it like: scrapy crawl my_spider -a start_url="http://some_url"
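
One thing to watch with the spider above: if `-a start_url=...` is left out, `kwargs.get('start_url')` returns `None` and `start_urls` becomes `[None]`. A minimal sketch of a guard follows. The helper name `parse_start_urls` is hypothetical, and the comma-separated format for passing several URLs is an assumption of this example, not something Scrapy defines. It would also split a URL that legitimately contains a comma.

```python
def parse_start_urls(raw):
    """Turn the raw value of -a start_url=... into a list of URLs.

    Assumption: several URLs may be given as one comma-separated string.
    A missing or empty value yields an empty list instead of [None].
    """
    if not raw:
        return []
    # Drop surrounding whitespace and skip empty pieces (e.g. a trailing comma).
    return [url.strip() for url in raw.split(',') if url.strip()]


# Inside the spider's __init__, this would replace the direct lookup:
#     self.start_urls = parse_start_urls(kwargs.get('start_url'))
```

With this in place the spider could be started the same way as before, or with a quoted list such as `-a start_url="http://some_url,http://other_url"`.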
