I am using Python.org version 2.7 64 bit on Windows Vista 64 bit. I have been testing the following Scrapy code to recursively scrape all the pages at the site www.whoscored
I do not if this still available, but I have to put the next lines in the setting.py file:
HTTPERROR_ALLOWED_CODES =[404]
USER_AGENT = 'quotesbot (+http://www.yourdomain.com)'
USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36"
hope it helps.