How to get string objects instead of Unicode from JSON?

前端 未结 21 1083
伪装坚强ぢ
伪装坚强ぢ 2020-11-22 14:43

I\'m using Python 2 to parse JSON from ASCII encoded text files.

When loading these files with either json or simplejson, all my

21条回答
  •  [愿得一人]
    2020-11-22 15:34

    You can use the object_hook parameter for json.loads to pass in a converter. You don't have to do the conversion after the fact. The json module will always pass the object_hook dicts only, and it will recursively pass in nested dicts, so you don't have to recurse into nested dicts yourself. I don't think I would convert unicode strings to numbers like Wells shows. If it's a unicode string, it was quoted as a string in the JSON file, so it is supposed to be a string (or the file is bad).

    Also, I'd try to avoid doing something like str(val) on a unicode object. You should use value.encode(encoding) with a valid encoding, depending on what your external lib expects.

    So, for example:

    def _decode_list(data):
        rv = []
        for item in data:
            if isinstance(item, unicode):
                item = item.encode('utf-8')
            elif isinstance(item, list):
                item = _decode_list(item)
            elif isinstance(item, dict):
                item = _decode_dict(item)
            rv.append(item)
        return rv
    
    def _decode_dict(data):
        rv = {}
        for key, value in data.iteritems():
            if isinstance(key, unicode):
                key = key.encode('utf-8')
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            elif isinstance(value, list):
                value = _decode_list(value)
            elif isinstance(value, dict):
                value = _decode_dict(value)
            rv[key] = value
        return rv
    
    obj = json.loads(s, object_hook=_decode_dict)
    

提交回复
热议问题