Scrapy: How to print the request referrer

遇见更好的自我 2020-12-15 20:05

Is it possible to get the request referrer from the response object in the parse function?

Thanks.

2 answers
  •  自闭症患者
    2020-12-15 20:31

    This question was asked a long time ago, and it has been answered well.

    However, here is a different approach in case the answer by Rostyslav Dzinko does not work for you.

    Let's say that you have two different parser methods:

    1. One parser (let's call it parser_A) parses the list of articles (the list page) to extract links and other info.
    2. Another parser (let's call it parser_B) extracts article info from the target article (the article page).

    If you cannot get the referrer URL (the URL of the list page) once you are in parser_B, you can set a custom header on the request in parser_A. The header travels with the request to parser_B, as in the following example:

    yield scrapy.Request(url=article_page_url, callback=self.parser_B, dont_filter=True, headers={'referer_url': list_page_url})
    

    Then, in parser_B, you can read the list page's URL back from the request headers:

    # Header values are stored as bytes, so decode before printing.
    print(response.request.headers.get('referer_url').decode())
    

    Hope this helps.
