Scrapy with a nested array

强颜欢笑 submitted on 2019-12-21 13:57:30

Question


I'm new to Scrapy and would like to understand how to scrape an object for output into nested JSON. Right now, I'm producing JSON that looks like this:

[
{'a' : 1, 
'b' : '2',
'c' : 3},
]

And I'd like it more like this:

[
{ 'a' : '1',
'_junk' : {
     'b' : 2,
     'c' : 3}},
]

---where I put some stuff in _junk subfields to post-process later.

The current code in the parser definition in my scrapername.py is:

item['a'] = x
item['b'] = y
item['c'] = z

And it seemed like

item['a'] = x
item['_junk']['b'] = y
item['_junk']['c'] = z

---might fix that, but I'm getting an error about the _junk key:

  File "/usr/local/lib/python2.7/dist-packages/scrapy/item.py", line 49, in __getitem__
    return self._values[key]
exceptions.KeyError: '_junk'

Does this mean I need to change my items.py somehow? Currently I have:

class Website(Item):
    a = Field()
    _junk = Field()
    b = Field()
    c = Field()

Answer 1:


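You need to create the _junk dictionary before storing items in it. The line item['_junk']['b'] = y first reads item['_junk'], and reading a field that has not been assigned a value yet is what raises the KeyError.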

item['a'] = x
item['_junk'] = {}
item['_junk']['b'] = y
item['_junk']['c'] = z
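
To see the difference between the failing and the working assignment order without running a spider, here is a minimal sketch. It assumes a hypothetical stand-in class, SketchItem, that only imitates the two relevant behaviours of a Scrapy Item (undeclared fields cannot be set, and unset fields raise KeyError on read). It is not Scrapy's own class, and the names build_item and build_item_broken are invented for illustration.

```python
import json


class SketchItem(dict):
    """Hypothetical stand-in for a Scrapy Item (an assumption, not Scrapy code).

    Only declared fields may be assigned, and reading a field that was
    never assigned raises KeyError, as in the traceback in the question.
    """
    fields = ('a', '_junk')

    def __setitem__(self, key, value):
        if key not in self.fields:
            raise KeyError('%s is not a declared field' % key)
        dict.__setitem__(self, key, value)


def build_item(x, y, z):
    """Working order: create the nested dict, then fill it."""
    item = SketchItem()
    item['a'] = x
    item['_junk'] = {}        # assign the container first
    item['_junk']['b'] = y    # now this read of item['_junk'] succeeds
    item['_junk']['c'] = z
    return item


def build_item_broken(x, y, z):
    """Failing order: item['_junk'] is read before it was ever assigned."""
    item = SketchItem()
    item['a'] = x
    item['_junk']['b'] = y    # raises KeyError: '_junk'
    return item


if __name__ == '__main__':
    # Serialize the working item to check the nested shape.
    print(json.dumps(build_item(1, 2, 3), sort_keys=True))
    # prints {"_junk": {"b": 2, "c": 3}, "a": 1}
```

With this layout only a and _junk are assigned on the item itself, so the separate b and c fields in items.py would appear to be unused, though leaving them declared should be harmless.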


Source: https://stackoverflow.com/questions/15507651/scrapy-with-a-nested-array
