Getting "Forbidden by robots.txt": scrapy

Submitted by 孤街浪徒 on 2019-11-28 06:43:23

In the new version (Scrapy 1.1, released 2016-05-11), the crawler first downloads robots.txt before crawling and obeys its rules. To change this behavior, set ROBOTSTXT_OBEY in your settings.py:

ROBOTSTXT_OBEY = False

Here are the release notes
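If you only want to ignore robots.txt for one spider rather than the whole project, Scrapy also lets you override settings per spider via the `custom_settings` class attribute. A minimal sketch, where the spider name and start URL are placeholders:

```python
import scrapy


class MySpider(scrapy.Spider):
    # Hypothetical spider name and start URL, for illustration only
    name = "my_spider"
    start_urls = ["https://example.com"]

    # Per-spider override: don't download or obey robots.txt
    custom_settings = {
        "ROBOTSTXT_OBEY": False,
    }

    def parse(self, response):
        yield {"title": response.css("title::text").get()}
```

`custom_settings` takes precedence over the project-wide settings.py for this spider only, so other spiders in the project keep the default behavior.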

The first thing you need to ensure is that you change the user agent in your requests; otherwise the default Scrapy user agent is very likely to be blocked.
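A browser-like user agent can be configured project-wide in settings.py. A sketch, where the UA string is only an illustrative example, not a recommendation:

```python
# settings.py -- example values, adjust to your project

# Don't download or obey robots.txt before crawling
ROBOTSTXT_OBEY = False

# Replace Scrapy's default user agent, which many sites block.
# This UA string is just an example of a browser-like value.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/115.0 Safari/537.36"
)
```

You can also set the user agent per request by passing a `headers` dict to `scrapy.Request`, which overrides the project-wide value for that request.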
