AttributeError: 'Response' object has no attribute 'body_as_unicode' scrapy for python

Posted by 删除回忆录丶 on 2019-12-10 18:05:52

Question


I am working with a response object in Scrapy and keep getting this error.

I have included only the snippet where the error occurs. I am trying to crawl several web pages and need to get the number of pages for each one. So I created a Response object from which I read the href of the "Next" button, but I keep getting AttributeError: 'Response' object has no attribute 'body_as_unicode'.

The code I am working with:

from scrapy.spiders import Spider
from scrapy.selector import Selector
from scrapy.http import Request
from scrapingtest.items import ScrapingTestingItem
from collections import OrderedDict
import json
from scrapy.selector.lxmlsel import HtmlXPathSelector
import csv
import scrapy
from scrapy.http import Response

class scrapingtestspider(Spider):
    name = "scrapytesting"
    allowed_domains = ["tripadvisor.in"]
 #   base_uri = ["tripadvisor.in"]

    def start_requests(self):
        site_array=["http://www.tripadvisor.in/Hotel_Review-g3581633-d2290190-Reviews-Corbett_Treetop_Riverview-Marchula_Jim_Corbett_National_Park_Uttarakhand.html",
                    "http://www.tripadvisor.in/Hotel_Review-g297600-d8029162-Reviews-Daman_Casa_Tesoro-Daman_Daman_and_Diu.html",
                    "http://www.tripadvisor.in/Hotel_Review-g304557-d2519662-Reviews-Darjeeling_Khushalaya_Sterling_Holidays_Resort-Darjeeling_West_Bengal.html",
                    "http://www.tripadvisor.in/Hotel_Review-g319724-d3795261-Reviews-Dharamshala_The_Sanctuary_A_Sterling_Holidays_Resort-Dharamsala_Himachal_Pradesh.html",
                    "http://www.tripadvisor.in/Hotel_Review-g1544623-d8029274-Reviews-Dindi_By_The_Godavari-Nalgonda_Andhra_Pradesh.html"]

        for i in range(len(site_array)):
            response = Response(url=site_array[i])
            sites = Selector(response).xpath('//a[contains(text(), "Next")]/@href').extract()
 #           sites = response.selector.xpath('//a[contains(text(), "Next")]/@href').extract()
            for site in sites:
                yield Request(site_array[i],self.parse)



Answer 1:


In this case, the line where the error occurs expects a TextResponse object, not a plain Response. Create a TextResponse instead of a plain Response to resolve the error.

The missing method, body_as_unicode, is documented as part of TextResponse in the Scrapy documentation.

More specifically, use an HtmlResponse, because your response will contain HTML rather than plain text. HtmlResponse is a subclass of TextResponse, so it inherits the missing method.

One more thing: where do you set the body of your Response? The example in your question sets only the URL and no body, and without a body your XPath query will return nothing.




Answer 2:


This does not directly answer the question, but it can help find the problem with the response object that is returned. I am adding it as an answer in case it helps someone debug a similar problem.

I encountered a similar error, AttributeError: 'HtmlResponse' object has no attribute 'text', when I ran:

scrapy shell 'http://example.com'
>>>response.text

To find out what the problem was, I inspected the attributes of the returned response object using:

response.__dict__

However, __dict__ only lists attributes stored on the instance itself; it does not include methods or properties defined on the object's class or its parent classes.

The response object that I received had the attribute _body, which contained the HTML for that page.
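The difference between __dict__ and the full attribute list can be seen with a small stand-alone example. The classes below are hypothetical stand-ins, not Scrapy's own classes; they only mimic the situation where the data lives in an instance attribute (_body) and the readable text is a property defined on a parent class.

```python
class Base:
    # A property defined on the parent class; it is not stored on instances.
    @property
    def text(self):
        return self._body.decode("utf-8")


class Page(Base):
    def __init__(self, body):
        # Only this attribute is stored on the instance itself.
        self._body = body


page = Page(b"<html></html>")

# vars(page) is page.__dict__: instance attributes only.
instance_attrs = sorted(vars(page))
text_in_dict = "text" in vars(page)

# dir(page) also includes names defined on the class and its parents.
text_in_dir = "text" in dir(page)
```

In this sketch __dict__ contains only _body, while dir(page) also reports text, so dir() (or hasattr) is the more reliable check when an attribute appears to be missing.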



Source: https://stackoverflow.com/questions/31646988/attributeerror-response-object-has-no-attribute-body-as-unicode-scrapy-for
