Scrapy returning scraped values into an array

北战南征 submitted on 2021-02-19 07:06:27

Question


Scrapy seems to be pulling the data out correctly, but is formatting the output in my JSON object as if it were an array:

[{"price": ["$34"], "link": ["/product/product..."], "name": ["productname"]},
{"price": ["$37"], "link": ["/product/product"]...

My spider class looks like this:

def parse(self, response):
    sel = Selector(response)
    items = sel.xpath('//div/ul[@class="product"]')
    skateboards = []
    for item in items:
        skateboard = SkateboardItem()
        skateboard['name'] = item.xpath('li[@class="desc"]//text()').extract()
        skateboard['price'] = item.xpath('li[@class="price"]//text()[1]').extract()
        skateboard['link'] = item.xpath('li[@class="image"]').extract()
        skateboards.append(skateboard)
    return skateboards

How would I go about ensuring that Scrapy is only outputting a single value for each key, rather than the array it's currently producing?


Answer 1:


.extract() always returns a list. You can use

''.join(item.xpath('li[@class="desc"]//text()').extract())

to get a single string.
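
Since `//text()` can match several text nodes, the joined string may pick up stray whitespace. Below is a minimal sketch of a join helper that works on a plain list standing in for the `.extract()` result. The name `join_text` and the sample fragments are hypothetical, not part of Scrapy:

```python
def join_text(fragments, sep=" "):
    """Collapse a list of extracted text nodes into one clean string.

    `fragments` stands in for the list returned by `.extract()`.
    """
    # Strip each fragment and drop the ones that are only whitespace.
    cleaned = [f.strip() for f in fragments if f.strip()]
    return sep.join(cleaned)


# Hypothetical sample of what `.extract()` might return for one item.
sample = ["\n  ", "productname", "\n"]
name = join_text(sample)
print(name)
```

In the spider this would be applied as `join_text(item.xpath(...).extract())`, assuming whitespace-only nodes should be ignored.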




Answer 2:


Use either:
1. .extract_first(), or
2. .extract()[0]

to get the data as a string.

PS: using Scrapy 1.2
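
The two options differ when the XPath matches nothing: `.extract()[0]` raises `IndexError` on an empty list, while `.extract_first()` returns `None`. Here is a small sketch of that difference on plain lists, with no Scrapy dependency. `first_or_default` is a hypothetical helper name, not a Scrapy API:

```python
def first_or_default(values, default=None):
    """Return the first element of `values`, or `default` if it is empty.

    Mirrors how `.extract_first()` treats an empty match, applied to the
    plain list that `.extract()` returns.
    """
    return values[0] if values else default


matched = ["$34"]  # stands in for a successful `.extract()` result
missing = []       # stands in for an XPath that matched nothing

print(first_or_default(matched))      # first value as a plain string
print(first_or_default(missing, ""))  # falls back instead of raising

try:
    missing[0]
except IndexError:
    print("indexing an empty result raises IndexError")
```

If some products may lack a price or link, the `None`-returning form is probably the safer choice.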



Source: https://stackoverflow.com/questions/23490643/scrapy-returning-scraped-values-into-an-array
