Python can't parse JSON with extra trailing comma

Submitted by 不羁的心 on 2019-12-02 04:12:32

Question


This code:

import json
s = '{ "key1": "value1", "key2": "value2", }'
json.loads(s)

produces this error in Python 2:

ValueError: Expecting property name: line 1 column 16 (char 15)

Similar result in Python 3:

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 16 (char 15)

If I remove the trailing comma (after "value2"), I get no error. But my code will process many different JSON documents, so I can't do it manually. Is it possible to set up the parser to ignore such trailing commas?


Answer 1:


The JSON specification doesn't allow trailing commas. The parser raises an error because it encounters an invalid syntax token.

You might be interested in using a different parser for those files, e.g., a parser built for the JSON5 spec, which allows such syntax.
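
To see exactly where the standard parser gives up, the exception can be inspected. This is a minimal Python 3 sketch; `describe_json_error` is a hypothetical helper name, not part of the `json` module, and the exact message text varies between Python versions.

```python
import json

def describe_json_error(text):
    """Return None if text is valid JSON, else a short description of the failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # JSONDecodeError subclasses ValueError and records where parsing stopped.
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None

print(describe_json_error('{ "key1": "value1", "key2": "value2", }'))
```

A helper like this could be used to log which incoming documents are malformed before deciding whether to switch parsers.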




Answer 2:


It could be that this data stream is JSON5, in which case there's a parser for that: https://pypi.org/project/json5/

Alternatively, a regex substitution can work around this: it looks for ", } and replaces it with " }, allowing any amount of whitespace between the quote, the comma, and the closing curly brace.

>>> import re
>>> s = '{ "key1": "value1", "key2": "value2", }'
>>> re.sub(r"\"\s*,\s*\}", "\" }", s)
'{ "key1": "value1", "key2": "value2" }'

The cleaned string then parses:

>>> import json
>>> s2 = re.sub(r"\"\s*,\s*\}", "\" }", s)
>>> json.loads(s2)
{'key1': 'value1', 'key2': 'value2'}

EDIT: as commented, this is not good practice unless you are confident that your JSON data contains only simple words and that the substitution does not corrupt the data stream further. As I commented on the OP, the best course of action is to repair the upstream data source. But sometimes that's not possible.
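
One way to reduce the corruption risk mentioned in the EDIT is to make the regex skip over string literals, so that a comma inside a value is never touched. The following is only a sketch, not the answer's original method: `strip_trailing_commas` is a hypothetical name, and it assumes the input is otherwise valid JSON (double-quoted strings, no comments).

```python
import json
import re

# Match either a complete JSON string literal (group 1) or a comma that is
# followed only by whitespace and a closing brace or bracket.
_TRAILING_COMMA = re.compile(r'("(?:\\.|[^"\\])*")|,(?=\s*[}\]])')

def strip_trailing_commas(text):
    # String literals are put back unchanged; matched commas are dropped.
    return _TRAILING_COMMA.sub(lambda m: m.group(1) or "", text)

s = '{ "key1": "value1, }", "key2": ["a", "b", ], }'
print(json.loads(strip_trailing_commas(s)))
```

Because the string alternative is tried first at every position, the `, }` inside "value1, }" is consumed as part of the string and survives the substitution.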




Answer 3:


That's because an extra , is invalid according to the JSON standard.

An object is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon) and the name/value pairs are separated by , (comma).

If you really need this, you could wrap Python's json parser with jsoncomment. But I would try to fix the JSON at its source.




Answer 4:


I suspect it doesn't parse because "it's not JSON", but you could pre-process the strings, using a regular expression to replace , } with } and , ] with ].
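
The pre-processing described above might look like the sketch below. `drop_trailing_commas_naive` is a hypothetical name, and the approach is deliberately simple: it does not know about string literals.

```python
import json
import re

def drop_trailing_commas_naive(text):
    # Remove any comma that is followed only by whitespace and then } or ].
    return re.sub(r",\s*([}\]])", r"\1", text)

s = '{ "key1": ["a", "b", ], "key2": "value2", }'
print(json.loads(drop_trailing_commas_naive(s)))
```

Note the limitation: a value that itself contains such a sequence is altered too. For example, '{"note": "a, }"}' becomes '{"note": "a}"}', so this is only safe when the values are known not to contain a comma followed by a closing bracket.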




Answer 5:


How about using the following regex?

import re
s = re.sub(r",\s*}", "}", s)
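
Since the substitution can alter valid data, one option is to apply it only when strict parsing has already failed. A minimal sketch, assuming the regex above is acceptable as a fallback; `loads_lenient` is a hypothetical name:

```python
import json
import re

def loads_lenient(text):
    try:
        # Valid JSON is parsed as-is and never rewritten.
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a ValueError subclass, so this covers Python 2 and 3.
        cleaned = re.sub(r",\s*}", "}", text)
        return json.loads(cleaned)

print(loads_lenient('{ "key1": "value1", "key2": "value2", }'))
```

If the cleaned text still does not parse, the second `json.loads` call raises as usual, so genuinely broken input is not silently accepted.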


Source: https://stackoverflow.com/questions/52636846/python-cant-parse-json-with-extra-trailing-comma
