Question
All,
I'm trying to understand how to handle a list of Dicts using pyparsing. I've gone back to the example JSON parser for best practices but I've found that it can't handle a list of dicts either!
Consider the following (this is the stock example JSON parser, but with some comments removed and my test case instead of the default one):
#!/usr/bin/env python2.7
from pyparsing import *
TRUE = Keyword("true").setParseAction( replaceWith(True) )
FALSE = Keyword("false").setParseAction( replaceWith(False) )
NULL = Keyword("null").setParseAction( replaceWith(None) )
jsonString = dblQuotedString.setParseAction( removeQuotes )
jsonNumber = Combine( Optional('-') + ( '0' | Word('123456789',nums) ) +
                    Optional( '.' + Word(nums) ) +
                    Optional( Word('eE',exact=1) + Word(nums+'+-',nums) ) )
jsonObject = Forward()
jsonValue = Forward()
jsonElements = delimitedList( jsonValue )
jsonArray = Group(Suppress('[') + Optional(jsonElements) + Suppress(']') )
jsonValue << ( jsonString | jsonNumber | Group(jsonObject) | jsonArray | TRUE | FALSE | NULL )
memberDef = Group( jsonString + Suppress(':') + jsonValue )
jsonMembers = delimitedList( memberDef )
jsonObject << Dict( Suppress('{') + Optional(jsonMembers) + Suppress('}') )
jsonComment = cppStyleComment
jsonObject.ignore( jsonComment )
def convertNumbers(s,l,toks):
    n = toks[0]
    try:
        return int(n)
    except ValueError, ve:
        return float(n)
jsonNumber.setParseAction( convertNumbers )
if __name__ == "__main__":
    testdata = """
    [ { "foo": "bar", "baz": "bar2" },
      { "foo": "bob", "baz": "fez" } ]
    """
    results = jsonValue.parseString(testdata)
    print "[0]:", results[0].dump()
    print "[1]:", results[1].dump()
This is valid JSON, but the pyparsing example fails when trying to index into the second expected array element:
[0]: [[['foo', 'bar'], ['baz', 'bar2']], [['foo', 'bob'], ['baz', 'fez']]]
[1]:
Traceback (most recent call last):
  File "json2.py", line 42, in <module>
    print "[1]:", results[1].dump()
  File "/Library/Python/2.7/site-packages/pyparsing.py", line 317, in __getitem__
    return self.__toklist[i]
IndexError: list index out of range
Can anyone help me in identifying what's wrong with this grammar?
EDIT: Fixed bug in trying to parse as JSON Object, not value.
Note: This is related to "pyparsing: grammar for list of Dictionaries (erlang)", where I'm basically trying to do the same with an Erlang data structure, and failing in a similar way :(
Answer 1:
The parse results object that you get back from this expression is a list of the matched tokens. pyparsing doesn't know whether you are going to match one token or many, so it always returns a list; in your case, a list containing one element, the array of dicts.
Change
results = jsonValue.parseString(testdata)
to
results = jsonValue.parseString(testdata)[0]
and I think things will start to look better. After doing this, I get:
[0]: [['foo', 'bar'], ['baz', 'bar2']]
- baz: bar2
- foo: bar
[1]: [['foo', 'bob'], ['baz', 'fez']]
- baz: fez
- foo: bob
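To see the single-element wrapper for yourself, here is a minimal sketch. It assumes Python 3 and a cut-down grammar (strings, objects and arrays only); the snake_case names are my own and are not part of the stock example.

```python
from pyparsing import (Dict, Forward, Group, Optional, Suppress,
                       dblQuotedString, delimitedList, removeQuotes)

# Cut-down JSON grammar: strings, objects and arrays only (an assumption
# made to keep the sketch short; numbers and literals are left out).
# copy() keeps the parse action off the shared dblQuotedString expression.
json_string = dblQuotedString.copy().setParseAction(removeQuotes)
json_value = Forward()
member = Group(json_string + Suppress(":") + json_value)
json_object = Dict(Suppress("{") + Optional(delimitedList(member)) + Suppress("}"))
json_array = Group(Suppress("[") + Optional(delimitedList(json_value)) + Suppress("]"))
json_value << (json_string | Group(json_object) | json_array)

testdata = """
[ { "foo": "bar", "baz": "bar2" },
  { "foo": "bob", "baz": "fez" } ]
"""

# parseString always hands back a ParseResults list of matched tokens.
# Here there is exactly one token: the Group produced by json_array.
results = json_value.parseString(testdata)

try:
    results[1]
    second_token_exists = True
except IndexError:
    second_token_exists = False

# Unwrap the outer list to reach the array itself.
array = results[0]

print(len(results), second_token_exists)
print(array[0]["foo"], array[1]["baz"])
```

Indexing with [0] once, right after parseString, gives you the array, and each of its elements then supports dict-style lookups through Dict.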
Answer 2:
This may be valid JSON, but your grammar won't handle it. Here's why:
jsonObject << Dict( Suppress('{') + Optional(jsonMembers) + Suppress('}') )
This says the grammar object must be surrounded by {...}. You are bracing it as an array [...]. Since the top-level object must be a dictionary, it will need key names. Changing your test data to:
{ "col1":{ "foo": "bar", "baz": "bar2" },
"col2":{ "foo": "bob", "baz": "fez" } }
or
{ "data":[{ "foo": "bar", "baz": "bar2" },
{ "foo": "bob", "baz": "fez" }] }
will allow this grammar to parse it. Want a top-level object to be an array? Just modify the grammar!
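As a rough illustration of that last point, here is a sketch in which a hypothetical top-level rule, json_document, accepts either an object or an array. It assumes Python 3 and a cut-down grammar (strings, objects and arrays only); json_document and the snake_case names are my own. Note that jsonValue in the question's grammar already lists jsonArray as an alternative, so parsing with jsonValue should have much the same effect.

```python
from pyparsing import (Dict, Forward, Group, Optional, ParseException,
                       Suppress, dblQuotedString, delimitedList, removeQuotes)

# Cut-down JSON grammar (assumption: numbers and literals omitted for brevity).
json_string = dblQuotedString.copy().setParseAction(removeQuotes)
json_value = Forward()
member = Group(json_string + Suppress(":") + json_value)
json_object = Dict(Suppress("{") + Optional(delimitedList(member)) + Suppress("}"))
json_array = Group(Suppress("[") + Optional(delimitedList(json_value)) + Suppress("]"))
json_value << (json_string | Group(json_object) | json_array)

# Hypothetical top-level rule: a document is either an object or an array.
json_document = Group(json_object) | json_array

array_text = '[ { "foo": "bar" }, { "foo": "bob" } ]'
object_text = '{ "data": [ { "foo": "bar" }, { "foo": "bob" } ] }'

# The object rule on its own rejects input that starts with '['.
try:
    json_object.parseString(array_text)
    object_rule_accepts_array = True
except ParseException:
    object_rule_accepts_array = False

# The top-level rule accepts both shapes; [0] unwraps the result list.
top_array = json_document.parseString(array_text)[0]
top_object = json_document.parseString(object_text)[0]

print(object_rule_accepts_array)
print(top_array[1]["foo"], top_object["data"][1]["foo"])
```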
Source: https://stackoverflow.com/questions/22081239/pyparsing-example-json-parser-fails-for-list-of-dicts