Screen scraping: getting around “HTTP Error 403: request disallowed by robots.txt”

借酒劲吻你 2020-12-12 17:15

Is there a way to get around the following?

httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt

Is the only way around this to contact the site owner?

8 Answers
  •  温柔的废话
    2020-12-12 17:44

    The error you're receiving is not related to the user agent. By default, mechanize automatically honors a site's robots.txt directives when you navigate to it. Call the .set_handle_robots(False) method on your mechanize.Browser instance to disable this check.
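
    For example, a minimal sketch of the fix (the URL and user-agent string below are placeholders, not from the original post):

        import mechanize

        br = mechanize.Browser()

        # Disable the default robots.txt check that raises
        # "HTTP Error 403: request disallowed by robots.txt"
        br.set_handle_robots(False)

        # Optional: some sites also block mechanize's default user agent,
        # though that is a separate issue from the robots.txt error above
        br.addheaders = [('User-Agent', 'Mozilla/5.0')]

        response = br.open('http://www.example.com')  # placeholder URL
        html = response.read()

    Note that set_handle_robots(False) must be called before the first open(), since the robots.txt check happens as part of the request.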
