How to give a URL to Scrapy for crawling?

终归单人心 2020-11-29 01:42

I want to use Scrapy for crawling web pages. Is there a way to pass the start URL from the terminal itself?

It is given in the documentation that either the name of the spider or the URL can be given.

6 Answers
    谎友^ 2020-11-29 02:13

    An even easier way to pass multiple URL arguments than what Peter suggested is to give them as a single string, with the URLs separated by commas, like this:

    -a start_urls="http://example1.com,http://example2.com"
    
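    For example, a complete invocation might look like this (the spider name myspider is just a placeholder, not something from the original answer):

        scrapy crawl myspider -a start_urls="http://example1.com,http://example2.com"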

    In the spider you would then simply split the string on ',' to get a list of URLs:

    self.start_urls = kwargs.get('start_urls').split(',')
    
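    Putting it together, a minimal spider sketch could look like the following; the spider name, the missing-argument fallback, and the parse callback are illustrative additions, not part of the original answer:

        import scrapy

        class MySpider(scrapy.Spider):
            # Placeholder name; run with:
            #   scrapy crawl myspider -a start_urls="http://example1.com,http://example2.com"
            name = 'myspider'

            def __init__(self, *args, **kwargs):
                super().__init__(*args, **kwargs)
                # -a start_urls arrives as one comma-separated string;
                # split it into the list Scrapy expects (empty if absent)
                urls = kwargs.get('start_urls')
                self.start_urls = urls.split(',') if urls else []

            def parse(self, response):
                # Minimal callback: record each crawled URL
                yield {'url': response.url}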
