HTTP 403 Responses when using Python Scrapy

后悔当初 2020-12-31 14:24

I am using the Python.org build of Python 2.7 (64-bit) on Windows Vista 64-bit. I have been testing the following Scrapy code to recursively scrape all the pages at the site www.whoscored

2 answers
  •  一个人的身影
    2020-12-31 14:44

    I do not know if this is still relevant, but I had to put the following lines in the settings.py file:

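    # Let responses with these status codes reach the spider callbacks instead of being filtered out.
    HTTPERROR_ALLOWED_CODES = [404]
    # USER_AGENT is assigned twice; the second assignment wins, so only the browser-like string is sent.
    USER_AGENT = 'quotesbot (+http://www.yourdomain.com)'
    USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36"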
    

    Hope it helps.
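
The same two settings can also be scoped to a single spider rather than the whole project, since Scrapy spiders accept a `custom_settings` dict. The following is only a minimal sketch under stated assumptions: the helper name `browser_like_settings` and the spider name in the comment are hypothetical, and passing 403 as an allowed code is an assumption based on the question title, not something the answer above tested.

```python
# Browser-like User-Agent string taken from the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36"
)


def browser_like_settings(user_agent=BROWSER_UA, allowed_codes=(404,)):
    """Build a settings dict (hypothetical helper, not part of Scrapy).

    USER_AGENT and HTTPERROR_ALLOWED_CODES are real Scrapy setting names;
    the defaults mirror the values used in the answer above.
    """
    return {
        "USER_AGENT": user_agent,
        "HTTPERROR_ALLOWED_CODES": list(allowed_codes),
    }


# Assumption: also letting 403 responses through so the spider can inspect them.
settings_for_403 = browser_like_settings(allowed_codes=(403, 404))

# Hypothetical usage inside a spider (requires Scrapy to be installed):
#
# import scrapy
#
# class ExampleSpider(scrapy.Spider):
#     name = "example"
#     custom_settings = browser_like_settings(allowed_codes=(403, 404))
```

Allowing a status code only lets the response reach the spider's callback; it does not by itself stop the server from answering with 403, which is what the browser-like `USER_AGENT` is meant to address.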
