Scrapy redirect_urls exception.KeyError

Posted by 谁都会走 on 2021-01-03 06:07:03

Question


I am new to Scrapy and Python, and recently launched my first spider. One feature seemed to work before, but now it only works for some of the websites I am trying to scrape.

The code line is:

item['url_direct'] = response.request.meta['redirect_urls']

and the error I get is:

exceptions.KeyError: 'redirect_urls'

I have been struggling with this for a while, so any clue, or better yet a more detailed answer, would be very much appreciated. (I didn't find a similar question here or elsewhere on the web.)


Answer 1:


response.request.meta['redirect_urls'] is set by the RedirectMiddleware and lists the URLs the request went through while being redirected.

For requests that haven't been redirected, the key is never set, so that code fails with a KeyError.

Since response.request.meta is just a dict, you can use get(), which returns None when the key is missing:

item['url_direct'] = response.request.meta.get('redirect_urls')

Or you can check it before setting:

if 'redirect_urls' in response.request.meta:
    item['url_direct'] = response.request.meta['redirect_urls']
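
If a consistent value is wanted for every item, both approaches can be folded into a small helper. The following is only a minimal sketch: redirect_chain, meta and final_url are hypothetical names, with plain dicts standing in for response.request.meta and a string standing in for response.url, so it runs without Scrapy. It assumes 'redirect_urls' holds the URLs visited before the final one.

```python
def redirect_chain(meta, final_url):
    """Return every URL visited, ending with the final one.

    `meta` stands in for response.request.meta (a plain dict) and
    `final_url` for response.url; both names are hypothetical.
    """
    # 'redirect_urls' is only present when a redirect happened,
    # so fall back to an empty list for direct responses.
    visited = list(meta.get('redirect_urls', []))
    visited.append(final_url)
    return visited


# Simulated meta dicts: one redirected request, one direct request.
redirected = {'redirect_urls': ['http://example.com/old']}
direct = {}

print(redirect_chain(redirected, 'http://example.com/new'))
print(redirect_chain(direct, 'http://example.com/page'))
```

In a spider callback, the equivalent call would be redirect_chain(response.request.meta, response.url), so item['url_direct'] always receives a list and never raises a KeyError.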

See also:

  • RedirectMiddleware docs
  • how to get the original start_url in scrapy (before redirect) (related question)


Source: https://stackoverflow.com/questions/29779534/scrapy-redirect-urls-exception-keyerror
