Python flatten multilevel/nested JSON

隐瞒了意图╮ 2020-12-03 05:23

I am trying to convert JSON to a CSV file that I can use for further analysis. The issue with my structure is that I end up with quite a few nested dicts/lists when I convert my JSON file.

7 answers
  •  天涯浪人
    2020-12-03 06:21

    Cross-posting (but then adapting further) from https://stackoverflow.com/a/62186053/4355695 : In this repo: https://github.com/ScriptSmith/socialreaper/blob/master/socialreaper/tools.py#L8 , I found an implementation of the list-inclusion comment by @roneo to the answer posted by @Imran.

    I've added checks to it for catching empty lists and empty dicts. I've also added print lines that help show precisely how this function works. You can turn off those print statements by setting crumbs=False.

    # collections.MutableMapping was removed in Python 3.10; the ABC lives in collections.abc
    from collections.abc import MutableMapping

    crumbs = True  # set to False to silence the trace output

    def flatten(dictionary, parent_key=False, separator='.'):
        """
        Turn a nested dictionary into a flattened dictionary
        :param dictionary: The dictionary to flatten
        :param parent_key: The string to prepend to dictionary's keys
        :param separator: The string used to separate flattened keys
        :return: A flattened dictionary
        """

        items = []
        for key, value in dictionary.items():
            if crumbs: print('checking:',key)
            new_key = str(parent_key) + separator + key if parent_key else key
            if isinstance(value, MutableMapping):
                if crumbs: print(new_key,': dict found')
                if not value.items():
                    if crumbs: print('Adding key-value pair:',new_key,None)
                    items.append((new_key,None))
                else:
                    items.extend(flatten(value, new_key, separator).items())
            elif isinstance(value, list):
                if crumbs: print(new_key,': list found')
                if len(value):
                    for k, v in enumerate(value):
                        # pass separator along so a custom separator also applies inside lists
                        items.extend(flatten({str(k): v}, new_key, separator).items())
                else:
                    if crumbs: print('Adding key-value pair:',new_key,None)
                    items.append((new_key,None))
            else:
                if crumbs: print('Adding key-value pair:',new_key,value)
                items.append((new_key, value))
        return dict(items)
    

    Test it:

    ans = flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3], 'e':{'f':[], 'g':{}} })
    print('\nflattened:',ans)
    

    Output:

    checking: a
    Adding key-value pair: a 1
    checking: c
    c : dict found
    checking: a
    Adding key-value pair: c.a 2
    checking: b
    c.b : dict found
    checking: x
    Adding key-value pair: c.b.x 5
    checking: y
    Adding key-value pair: c.b.y 10
    checking: d
    d : list found
    checking: 0
    Adding key-value pair: d.0 1
    checking: 1
    Adding key-value pair: d.1 2
    checking: 2
    Adding key-value pair: d.2 3
    checking: e
    e : dict found
    checking: f
    e.f : list found
    Adding key-value pair: e.f None
    checking: g
    e.g : dict found
    Adding key-value pair: e.g None
    
    flattened: {'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd.0': 1, 'd.1': 2, 'd.2': 3, 'e.f': None, 'e.g': None}
    

    And that does the job I need done: I throw any complicated JSON at this and it flattens it out for me. I added a check to the original code to handle empty lists too.

    Credits to https://github.com/ScriptSmith , in whose repo I found the initial flatten function.

    Testing OP's sample json, here's the output:

    {'count': 13,
     'virtualmachine.0.id': '1082e2ed-ff66-40b1-a41b-26061afd4a0b',
     'virtualmachine.0.name': 'test-2',
     'virtualmachine.0.displayname': 'test-2',
     'virtualmachine.0.securitygroup.0.id': '9e649fbc-3e64-4395-9629-5e1215b34e58',
     'virtualmachine.0.securitygroup.0.name': 'test',
     'virtualmachine.0.securitygroup.0.tags': None,
     'virtualmachine.0.nic.0.id': '79568b14-b377-4d4f-b024-87dc22492b8e',
     'virtualmachine.0.nic.0.networkid': '05c0e278-7ab4-4a6d-aa9c-3158620b6471',
     'virtualmachine.0.nic.1.id': '3d7f2818-1f19-46e7-aa98-956526c5b1ad',
     'virtualmachine.0.nic.1.networkid': 'b4648cfd-0795-43fc-9e50-6ee9ddefc5bd',
     'virtualmachine.0.nic.1.traffictype': 'Guest',
     'virtualmachine.0.hypervisor': 'KVM',
     'virtualmachine.0.affinitygroup': None,
     'virtualmachine.0.isdynamicallyscalable': False}
    

    So you'll see that the 'tags' and 'affinitygroup' keys are also handled and added to the output. The original code omitted them.
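
Since the question's end goal is a CSV file, here is a minimal sketch of how flattened records might be written out with the standard library's csv.DictWriter. It is only an illustration under some assumptions: flatten_quiet and records_to_csv are hypothetical names (flatten_quiet is a compact, print-free variant of the same flattening logic, not the function above verbatim), and the sample records are made up.

```python
import csv
import io
from collections.abc import MutableMapping


def flatten_quiet(obj, parent_key="", separator="."):
    """Compact, print-free variant of the flattening logic (hypothetical helper)."""
    flat = {}
    for key, value in obj.items():
        new_key = f"{parent_key}{separator}{key}" if parent_key else str(key)
        if isinstance(value, list):
            # Treat a list as a dict keyed by index so one branch handles both.
            value = {str(i): v for i, v in enumerate(value)}
        if isinstance(value, MutableMapping):
            if value:
                flat.update(flatten_quiet(value, new_key, separator))
            else:
                flat[new_key] = None  # empty dict or empty list: keep the key
        else:
            flat[new_key] = value
    return flat


def records_to_csv(records, out):
    """Flatten each record and write one CSV row per record to the file-like `out`."""
    rows = [flatten_quiet(r) for r in records]
    # Union of all keys in first-seen order, so rows with different shapes line up.
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Made-up sample records, loosely shaped like the flattened output shown earlier.
sample = [
    {"name": "test-1", "nic": [{"id": "a"}], "tags": []},
    {"name": "test-2", "nic": [{"id": "b"}, {"id": "c"}], "hypervisor": "KVM"},
]
buffer = io.StringIO()
columns = records_to_csv(sample, buffer)
print(buffer.getvalue())
```

For a payload shaped like the OP's, one would probably pass the list under 'virtualmachine' as the records, so that each virtual machine becomes one row rather than the whole response becoming a single very wide row. Columns missing from a given record are left empty, and None values are written as empty cells.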
