How can I select deeply nested key:values from a dictionary in Python

执笔经年 2020-12-19 21:00

I have downloaded JSON data from a website, and I want to select specific key:value pairs from the nested JSON. I converted the JSON to a Python dictionary. Then I used dictionary

2 Answers
  • 2020-12-19 21:10

    Here's how you can use my find_key generator from "Functions that help to understand json(dict) structure" to get the 'id' value from that JSON data, along with a few other keys I chose at random. This code gets the JSON data from a string rather than reading it from a file.

    import json
    
    json_data = '''\
    {
        "success": true,
        "payload": {
            "tag": {
                "slug": "python",
                "name": "Python",
                "postCount": 10590,
                "virtuals": {
                    "isFollowing": false
                }
            },
            "metadata": {
                "followerCount": 18053,
                "postCount": 10590,
                "coverImage": {
                    "id": "1*O3-jbieSsxcQFkrTLp-1zw.gif",
                    "originalWidth": 550,
                    "originalHeight": 300
                }
            }
        }
    }
    '''
    
    data = r'data.json'
    
    #def js_r(data):
        #with open(data, encoding='Latin-1') as f_in:
            #return json.load(f_in)
    
    # Read the JSON from the inline json_data string instead of from the data file
    def js_r(data):
        return json.loads(json_data)
    
    def find_key(obj, key):
        if isinstance(obj, dict):
            yield from iter_dict(obj, key, [])
        elif isinstance(obj, list):
            yield from iter_list(obj, key, [])
    
    def iter_dict(d, key, indices):
        for k, v in d.items():
            if k == key:
                yield indices + [k], v
            if isinstance(v, dict):
                yield from iter_dict(v, key, indices + [k])
            elif isinstance(v, list):
                yield from iter_list(v, key, indices + [k])
    
    def iter_list(seq, key, indices):
        for k, v in enumerate(seq):
            if isinstance(v, dict):
                yield from iter_dict(v, key, indices + [k])
            elif isinstance(v, list):
                yield from iter_list(v, key, indices + [k])
    
    if __name__=="__main__":
        # Read the JSON data
        my_dict = js_r(data)
        print("This is the JSON data:")
        print(json.dumps(my_dict, indent=4), "\n")
    
        # Find the id key
        keypath, val = next(find_key(my_dict, "id"))
        print("This is the id: {!r}".format(val))
        print("These are the keys that lead to the id:", keypath, "\n")
    
        # Find the name, followerCount, originalWidth, and originalHeight
        print("Here are some more (key, value) pairs")
        keys = ("name", "followerCount", "originalWidth", "originalHeight")
        for k in keys:
            keypath, val = next(find_key(my_dict, k))
            print("{!r}: {!r}".format(k, val))
    

    Output:

    This is the JSON data:
    {
        "success": true,
        "payload": {
            "tag": {
                "slug": "python",
                "name": "Python",
                "postCount": 10590,
                "virtuals": {
                    "isFollowing": false
                }
            },
            "metadata": {
                "followerCount": 18053,
                "postCount": 10590,
                "coverImage": {
                    "id": "1*O3-jbieSsxcQFkrTLp-1zw.gif",
                    "originalWidth": 550,
                    "originalHeight": 300
                }
            }
        }
    } 
    
    This is the id: '1*O3-jbieSsxcQFkrTLp-1zw.gif'
    These are the keys that lead to the id: ['payload', 'metadata', 'coverImage', 'id'] 
    
    Here are some more (key, value) pairs
    'name': 'Python'
    'followerCount': 18053
    'originalWidth': 550
    'originalHeight': 300
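
Because find_key is a generator, next() only returns the first match. The sample data contains "postCount" twice, so it can be useful to collect every match instead. The sketch below repeats the three generator functions so that it runs on its own, uses a trimmed copy of the sample data, and adds a helper named follow, which is a hypothetical name introduced here for illustration and not part of the original answer. It also shows passing a default to next() so that a missing key does not raise StopIteration.

```python
import json
from functools import reduce
from operator import getitem

def find_key(obj, key):
    if isinstance(obj, dict):
        yield from iter_dict(obj, key, [])
    elif isinstance(obj, list):
        yield from iter_list(obj, key, [])

def iter_dict(d, key, indices):
    for k, v in d.items():
        if k == key:
            yield indices + [k], v
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])

def iter_list(seq, key, indices):
    for k, v in enumerate(seq):
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])

def follow(obj, keypath):
    """Hypothetical helper: index into obj with each key in keypath in turn."""
    return reduce(getitem, keypath, obj)

# Trimmed copy of the sample data, kept inline so the sketch is self-contained
my_dict = json.loads('''
{"payload": {"tag": {"name": "Python", "postCount": 10590},
             "metadata": {"postCount": 10590,
                          "coverImage": {"id": "1*O3-jbieSsxcQFkrTLp-1zw.gif"}}}}
''')

# Collect every (keypath, value) pair for a key that occurs more than once
all_post_counts = list(find_key(my_dict, "postCount"))
for keypath, val in all_post_counts:
    print(keypath, val)

# A keypath returned by find_key can be walked to reach the same value again
id_path, id_val = next(find_key(my_dict, "id"))
print(follow(my_dict, id_path) == id_val)

# Supplying a default to next() avoids StopIteration when the key is absent
missing = next(find_key(my_dict, "noSuchKey"), None)
print(missing)
```

The key paths are lists rather than dotted strings, so the same approach also works when a path passes through list indices.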
    

    BTW, JSON normally uses a UTF encoding, not Latin-1. The default encoding is UTF-8, and you should use that if possible.
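
As a minimal sketch of that advice, here is a file-reading version of js_r that opens the file as UTF-8. The sample contents and the temporary file are assumptions made only so the example runs on its own; with real data you would pass the path to your own data.json.

```python
import json
import os
import tempfile

def js_r(path):
    """Load JSON from a file, decoding it as UTF-8 rather than Latin-1."""
    with open(path, encoding="utf-8") as f_in:
        return json.load(f_in)

# Made-up sample contents, including a non-ASCII character
sample = {"payload": {"tag": {"name": "Python"}}, "note": "caf\u00e9"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json")
    # Write the sample as UTF-8 so there is a file to read back
    with open(path, "w", encoding="utf-8") as f_out:
        json.dump(sample, f_out, ensure_ascii=False)
    loaded = js_r(path)

print(loaded["payload"]["tag"]["name"])
```

If the file really was saved as Latin-1, decoding it as UTF-8 may raise UnicodeDecodeError on non-ASCII bytes, which is a useful signal that the file should be re-encoded.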

  • 2020-12-19 21:26

    I suggest you use python-benedict, a solid Python dict subclass with full keypath support and many utility methods.

    It provides I/O support for many formats, including JSON.

    You can initialize it directly from the json file:

    from benedict import benedict
    
    d = benedict.from_json('data.json')
    

    Now your dict has keypath support:

    print(d['payload.metadata.coverImage.id'])
    
    # or use get to avoid a possible KeyError
    print(d.get('payload.metadata.coverImage.id'))
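
To make the idea of a keypath concrete, here is a plain-Python sketch of a dotted-path lookup on an ordinary dict. This is not how python-benedict is implemented; the function name get_by_keypath and its parameters are hypothetical, chosen only for illustration.

```python
def get_by_keypath(d, keypath, default=None, sep="."):
    """Hypothetical helper: follow a dotted keypath through nested dicts.

    Returns default as soon as any key along the path is missing.
    """
    current = d
    for key in keypath.split(sep):
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current

# Trimmed copy of the sample data from the first answer
data = {
    "payload": {
        "metadata": {
            "followerCount": 18053,
            "coverImage": {"id": "1*O3-jbieSsxcQFkrTLp-1zw.gif"},
        }
    }
}

print(get_by_keypath(data, "payload.metadata.coverImage.id"))
print(get_by_keypath(data, "payload.metadata.noSuchKey", default="n/a"))
```

This sketch cannot address keys that themselves contain the separator and does not handle list indices, which are the kinds of cases a dedicated library is meant to cover.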
    

    Installation: pip install python-benedict

    Here is the library repository and documentation: https://github.com/fabiocaccamo/python-benedict

    Note: I am the author of this project
