I\'m using Python 2 to parse JSON from ASCII encoded text files.
When loading these files with either json or simplejson, all my
You can use the object_hook
parameter for json.loads to pass in a converter. You don't have to do the conversion after the fact. The json module will always pass the object_hook
dicts only, and it will recursively pass in nested dicts, so you don't have to recurse into nested dicts yourself. I don't think I would convert unicode strings to numbers like Wells shows. If it's a unicode string, it was quoted as a string in the JSON file, so it is supposed to be a string (or the file is bad).
Also, I'd try to avoid doing something like str(val)
on a unicode
object. You should use value.encode(encoding)
with a valid encoding, depending on what your external lib expects.
So, for example:
def _decode_list(data):
rv = []
for item in data:
if isinstance(item, unicode):
item = item.encode('utf-8')
elif isinstance(item, list):
item = _decode_list(item)
elif isinstance(item, dict):
item = _decode_dict(item)
rv.append(item)
return rv
def _decode_dict(data):
rv = {}
for key, value in data.iteritems():
if isinstance(key, unicode):
key = key.encode('utf-8')
if isinstance(value, unicode):
value = value.encode('utf-8')
elif isinstance(value, list):
value = _decode_list(value)
elif isinstance(value, dict):
value = _decode_dict(value)
rv[key] = value
return rv
obj = json.loads(s, object_hook=_decode_dict)