How to completely traverse a complex dictionary of unknown depth?

深忆病人 2020-11-30 17:35

Importing from JSON can produce very complex, deeply nested structures. For example:

{u'body': [{u'declarations': [{u'id': {u'name': u'i',
            


        
7 Answers
  • 2020-11-30 18:13

    A small addition to the recursive walk suggested in another answer, so that it also handles JSON containing lists:

    #!/usr/bin/env python

    import json

    def walk(node, path=()):
        """Print every leaf value together with its dotted key path."""
        if isinstance(node, dict):
            for k, v in node.items():
                walk(v, path + (str(k),))
        elif isinstance(node, (list, tuple)):
            # List items share the key of the list; scalars in lists are handled too
            for item in node:
                walk(item, path)
        elif node is None:
            pass  # do something special
        elif isinstance(node, (str, int, float)):
            print("{}={}".format(".".join(path), node))
        else:
            print("###Type {} not recognized: {}={}".format(type(node), ".".join(path), node))

    with open('abc.json') as f:
        myjson = json.load(f)

    walk(myjson)
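
For documents nested deeper than Python's recursion limit, a recursive walk can raise RecursionError. Below is a minimal sketch of the same traversal with an explicit stack. The name `iter_leaves` and the sample data are made up for illustration, and unlike the version above it records list indices in the path.

```python
def iter_leaves(root):
    """Yield (path, value) for every leaf, using an explicit stack instead of recursion."""
    stack = [((), root)]
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            children = list(node.items())
        elif isinstance(node, (list, tuple)):
            children = list(enumerate(node))
        else:
            yield path, node
            continue
        # Push in reverse so children come off the stack in their original order
        for key, child in reversed(children):
            stack.append((path + (key,), child))


doc = {"a": {"b": [10, {"c": None}]}, "d": "x"}  # hypothetical sample data
for path, value in iter_leaves(doc):
    print(".".join(str(p) for p in path), "=", value)
```

Because it is a generator, the caller decides what to do with each leaf instead of editing the walker itself.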
    
  • 2020-11-30 18:21

    If you only need to walk the dictionary, I'd suggest a recursive walk function that takes a dictionary, visits each of its elements, and calls itself on any nested collection. Something like this:

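    def walk(node):
        # Dicts give (key, value) pairs; lists and tuples give (index, value) pairs
        items = node.items() if isinstance(node, dict) else enumerate(node)
        for key, item in items:
            if isinstance(item, (dict, list, tuple)):
                walk(item)  # item is a collection: recurse into it
            else:
                print(key, item)  # it is a leaf, do your thing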
    

    If you also want to search for elements, or query several elements that pass certain criteria, have a look at the jsonpath module.
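
If an extra dependency is not an option, a simple search can also be sketched with the standard library alone. The helper below is a hypothetical `find_key` (not part of jsonpath) that collects every value stored under a given key at any depth, which is roughly what a recursive-descent query does. The sample data is invented and only shaped like the question's structure.

```python
def find_key(node, wanted):
    """Yield every value stored under the key `wanted`, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == wanted:
                yield value
            # Keep descending: the same key may appear again further down
            yield from find_key(value, wanted)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from find_key(item, wanted)


# Hypothetical sample shaped like the question's data
doc = {"type": "Program",
       "body": [{"id": {"type": "Identifier", "name": "i"}},
                {"id": {"type": "Identifier", "name": "j"}}]}
print(list(find_key(doc, "name")))
```

For anything beyond a single key lookup (filters, wildcards, slices), a dedicated query library is probably the better fit.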

  • 2020-11-30 18:24

    You can use a recursive generator for converting your dictionary to flat lists.

    def dict_generator(indict, pre=None):
        pre = pre[:] if pre else []
        if isinstance(indict, dict):
            for key, value in indict.items():
                if isinstance(value, dict):
                    for d in dict_generator(value, pre + [key]):
                        yield d
                elif isinstance(value, list) or isinstance(value, tuple):
                    for v in value:
                        for d in dict_generator(v, pre + [key]):
                            yield d
                else:
                    yield pre + [key, value]
        else:
            yield pre + [indict]
    

    It yields:

    [u'body', u'kind', u'var']
    [u'body', u'declarations', u'init', u'type', u'Literal']
    [u'body', u'declarations', u'init', u'value', 2]
    [u'body', u'declarations', u'type', u'VariableDeclarator']
    [u'body', u'declarations', u'id', u'type', u'Identifier']
    [u'body', u'declarations', u'id', u'name', u'i']
    [u'body', u'type', u'VariableDeclaration']
    [u'body', u'kind', u'var']
    [u'body', u'declarations', u'init', u'type', u'Literal']
    [u'body', u'declarations', u'init', u'value', 4]
    [u'body', u'declarations', u'type', u'VariableDeclarator']
    [u'body', u'declarations', u'id', u'type', u'Identifier']
    [u'body', u'declarations', u'id', u'name', u'j']
    [u'body', u'type', u'VariableDeclaration']
    [u'body', u'kind', u'var']
    [u'body', u'declarations', u'init', u'operator', u'*']
    [u'body', u'declarations', u'init', u'right', u'type', u'Identifier']
    [u'body', u'declarations', u'init', u'right', u'name', u'j']
    [u'body', u'declarations', u'init', u'type', u'BinaryExpression']
    [u'body', u'declarations', u'init', u'left', u'type', u'Identifier']
    [u'body', u'declarations', u'init', u'left', u'name', u'i']
    [u'body', u'declarations', u'type', u'VariableDeclarator']
    [u'body', u'declarations', u'id', u'type', u'Identifier']
    [u'body', u'declarations', u'id', u'name', u'answer']
    [u'body', u'type', u'VariableDeclaration']
    [u'type', u'Program']
    

    UPDATE: Fixed keys list from [key] + pre to pre + [key] as mentioned in comments.
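
As a usage sketch, each yielded list can be split into its key path and its leaf value, for instance to build a flat dictionary keyed by dotted paths. The generator is repeated here in Python 3 form (`yield from`) so the snippet runs on its own. The sample data and the name `flat` are assumptions for illustration.

```python
def dict_generator(indict, pre=None):
    # Same logic as the generator above, repeated so this snippet runs on its own
    pre = pre[:] if pre else []
    if isinstance(indict, dict):
        for key, value in indict.items():
            if isinstance(value, dict):
                yield from dict_generator(value, pre + [key])
            elif isinstance(value, (list, tuple)):
                for v in value:
                    yield from dict_generator(v, pre + [key])
            else:
                yield pre + [key, value]
    else:
        yield pre + [indict]


# Hypothetical sample data
doc = {"kind": "var", "declarations": [{"id": {"name": "i"}, "init": {"value": 2}}]}

# The last element of each list is the leaf value; everything before it is the path
flat = {".".join(map(str, keys)): value for *keys, value in dict_generator(doc)}
print(flat)
```

Since list indices are not part of the path here, two items of the same list that share keys map to the same dotted path, and the later one overwrites the earlier one in `flat`.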

  • 2020-11-30 18:28

    If you know the meaning of the data, you might want to create a parse function to turn the nested containers into a tree of objects of custom types. You'd then use methods of those custom objects to do whatever you need to do with the data.

    For your example data structure, you might create Program, VariableDeclaration, VariableDeclarator, Identifier, Literal and BinaryExpression classes, then use something like this for your parser:

    def parse(d):
        # Assumes the node classes named above are defined elsewhere
        t = d[u"type"]

        if t == u"Program":
            body = [parse(block) for block in d[u"body"]]
            return Program(body)

        elif t == u"VariableDeclaration":
            kind = d[u"kind"]
            declarations = [parse(declaration) for declaration in d[u"declarations"]]
            return VariableDeclaration(kind, declarations)

        elif t == u"VariableDeclarator":
            id = parse(d[u"id"])
            init = parse(d[u"init"])
            return VariableDeclarator(id, init)

        elif t == u"Identifier":
            return Identifier(d[u"name"])

        elif t == u"Literal":
            return Literal(d[u"value"])

        elif t == u"BinaryExpression":
            operator = d[u"operator"]
            left = parse(d[u"left"])
            right = parse(d[u"right"])
            return BinaryExpression(operator, left, right)

        else:
            raise ValueError("Invalid data structure.")
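
To make the idea of "methods on custom objects" concrete, here is a minimal sketch of three of those classes with a hypothetical `evaluate` method. The class layout, the `evaluate` method, the `_OPS` table and `parse_expr` (a compact stand-in for the expression branches of `parse`) are all assumptions for illustration, not a prescribed design.

```python
from dataclasses import dataclass
import operator as op

# Only a few operators, purely for illustration
_OPS = {"+": op.add, "-": op.sub, "*": op.mul}


@dataclass
class Identifier:
    name: str

    def evaluate(self, env):
        return env[self.name]  # look the variable up in the environment


@dataclass
class Literal:
    value: object

    def evaluate(self, env):
        return self.value


@dataclass
class BinaryExpression:
    operator: str
    left: object
    right: object

    def evaluate(self, env):
        return _OPS[self.operator](self.left.evaluate(env), self.right.evaluate(env))


def parse_expr(d):
    """Compact stand-in for the expression branches of parse()."""
    t = d["type"]
    if t == "Identifier":
        return Identifier(d["name"])
    elif t == "Literal":
        return Literal(d["value"])
    elif t == "BinaryExpression":
        return BinaryExpression(d["operator"], parse_expr(d["left"]), parse_expr(d["right"]))
    raise ValueError("Invalid data structure.")


node = {"type": "BinaryExpression", "operator": "*",
        "left": {"type": "Identifier", "name": "i"},
        "right": {"type": "Identifier", "name": "j"}}
tree = parse_expr(node)
print(tree.evaluate({"i": 2, "j": 4}))
```

Once the data is a tree of typed objects, each operation lives next to the node type it applies to, instead of in one large function full of type checks.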
    
  • 2020-11-30 18:32

    If the accepted answer works for you, but you'd also like a full, ordered path with the numerical index of the nested arrays included, this slight variation will work:

    def dict_generator(indict, pre=None):
        pre = pre[:] if pre else []
        if isinstance(indict, dict):
            for key, value in indict.items():
                if isinstance(value, dict):
                    for d in dict_generator(value, pre + [key]):
                        yield d
                elif isinstance(value, list) or isinstance(value, tuple):
                    # Include each item's index so the path is complete
                    for k, v in enumerate(value):
                        for d in dict_generator(v, pre + [key] + [k]):
                            yield d
                else:
                    yield pre + [key, value]
        else:
            # Scalar list items keep their path too
            yield pre + [indict]
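
One benefit of keeping the indices is that a path can be followed back down to the value it came from. A minimal sketch, assuming a hypothetical helper `get_by_path` and invented sample data:

```python
from functools import reduce
import operator


def get_by_path(root, path):
    """Follow a sequence of dict keys and list indices down from root."""
    return reduce(operator.getitem, path, root)


doc = {"body": [{"kind": "var"}, {"kind": "let"}]}  # hypothetical sample data

# For the second list item, the generator would yield ['body', 1, 'kind', 'let'];
# everything but the last element is the path, the last element is the value.
*path, value = ["body", 1, "kind", "let"]
print(path, "->", get_by_path(doc, path))
```

This works because both dictionaries and lists support `obj[key]`, so the same lookup step applies at every level.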
    
  • 2020-11-30 18:38

    Instead of writing your own parser, depending on the task, you could extend encoders and decoders from the standard library json module.

    I recommend this especially if you need to encode objects of custom classes as JSON. If the operation could also be done on a string representation of the JSON, consider iterating over JSONEncoder().iterencode as well.

    The reference for both is http://docs.python.org/2/library/json.html#encoders-and-decoders
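
A minimal sketch of both directions, using only the standard json module. The `Point` class, the `"__point__"` marker key, and the names `PointEncoder` and `as_point` are made up for this example. `JSONEncoder.default`, `object_hook` and `iterencode` are the actual extension points.

```python
import json


class Point:
    """Hypothetical custom class, used only to illustrate the hooks."""
    def __init__(self, x, y):
        self.x, self.y = x, y


class PointEncoder(json.JSONEncoder):
    def default(self, o):
        # Called only for objects the encoder cannot serialize by itself
        if isinstance(o, Point):
            return {"__point__": True, "x": o.x, "y": o.y}
        return super().default(o)


def as_point(d):
    # object_hook is called for every decoded JSON object, innermost first
    if d.get("__point__"):
        return Point(d["x"], d["y"])
    return d


text = json.dumps({"shape": [Point(1, 2), Point(3, 4)]}, cls=PointEncoder)
data = json.loads(text, object_hook=as_point)
print(type(data["shape"][0]).__name__, data["shape"][1].y)

# iterencode yields the encoded document piece by piece, as strings
chunks = list(PointEncoder().iterencode({"p": Point(5, 6)}))
print("".join(chunks))
```

With `object_hook`, the decoder does the traversal for you: every nested object passes through the hook, however deep it sits.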
