How to build a JSON file with nested records from a flat data table?

清歌不尽 2021-01-12 10:49

I'm looking for a Python technique to build a nested JSON file from a flat table in a pandas data frame. For example, how could a pandas data frame table such as:

2 answers

盖世英雄少女心 (original poster)

2021-01-12 11:30

With some input from @root I used a different tack and came up with the following code, which seems to get most of the way there:

import pandas
import json
from collections import defaultdict

inputExcel = 'E:\\teamsMM.xlsx'
exportJson = 'E:\\teamsMM.json'

# Recent pandas versions spell the keyword sheet_name and accept no encoding argument here
data = pandas.read_excel(inputExcel, sheet_name='SCAT Teams')

grouped = data.groupby(['teamname', 'members']).first()

results = defaultdict(lambda: defaultdict(dict))

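for t in grouped.itertuples():
    # t.Index is the (teamname, members) tuple; walk it down to the innermost dict
    for i, key in enumerate(t.Index):
        if i == 0:
            nested = results[key]
        elif i == len(t.Index) - 1:
            # t is a namedtuple, which json.dumps writes as a plain array
            nested[key] = t
        else:
            nested = nested[key]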


formattedJson = json.dumps(results, indent = 4)

formattedJson = '{\n"teams": [\n' + formattedJson +'\n]\n }'

# Use a with-block so the file is flushed and closed once the write finishes
with open(exportJson, "w") as parsed:
    parsed.write(formattedJson)

The resulting JSON file is this:

{
"teams": [
{
    "1": {
        "0": [
            [
                1, 
                0
            ], 
            "John", 
            "Doe", 
            "Anon", 
            "916-555-1234", 
            "none", 
            "john.doe@wildlife.net"
        ], 
        "1": [
            [
                1, 
                1
            ], 
            "Jane", 
            "Doe", 
            "Anon", 
            "916-555-4321", 
            "916-555-7890", 
            "jane.doe@wildlife.net"
        ]
    }, 
    "2": {
        "0": [
            [
                2, 
                0
            ], 
            "Mickey", 
            "Moose", 
            "Moosers", 
            "916-555-0000", 
            "916-555-1111", 
            "mickey.moose@wildlife.net"
        ], 
        "1": [
            [
                2, 
                1
            ], 
            "Minny", 
            "Moose", 
            "Moosers", 
            "916-555-2222", 
            "none", 
            "minny.moose@wildlife.net"
        ]
    }
}
]
 }

This format is very close to the desired end product. Two issues remain: removing the redundant array [1, 0] that appears just above each first name, and getting the keys for each nesting level to be "teamname": "1", "members": rather than "1": "0":
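One possible way to get those keys is to build the nested records explicitly instead of nesting on the index values. The following is only a sketch in plain Python so that it runs without the spreadsheet. The sample rows are hypothetical, only the column names teamname, members and firstname come from the post (lastname is assumed), and the target layout is inferred from the description above. With pandas, the rows could be obtained with data.to_dict(orient="records").

```python
import json

# Hypothetical rows standing in for the spreadsheet contents.
# "lastname" is an assumed column name; the post only names
# teamname, members and firstname.
rows = [
    {"teamname": 1, "members": 0, "firstname": "John", "lastname": "Doe"},
    {"teamname": 1, "members": 1, "firstname": "Jane", "lastname": "Doe"},
    {"teamname": 2, "members": 0, "firstname": "Mickey", "lastname": "Moose"},
    {"teamname": 2, "members": 1, "firstname": "Minny", "lastname": "Moose"},
]

teams = {}  # maps teamname -> team record; dicts keep insertion order
for row in rows:
    # Create the team record the first time this teamname is seen
    team = teams.setdefault(
        row["teamname"],
        {"teamname": str(row["teamname"]), "members": []},
    )
    # Keep the column names, but drop the two grouping columns
    member = {k: v for k, v in row.items() if k not in ("teamname", "members")}
    team["members"].append(member)

# "teams" is a real list here, so no manual string wrapping is needed
result = {"teams": list(teams.values())}
formatted_json = json.dumps(result, indent=4)
print(formatted_json)
```

Because each member is stored as a dict, the field names survive serialization, and the index pair such as [1, 0] never enters the output.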

Also, I do not know why each record is stripped of its headings during the conversion. For instance, why is the dictionary entry "firstname": "John" exported as just "John"?
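A likely explanation is that itertuples() yields namedtuples, and the json module treats any tuple, named or not, as an array, so the field names are dropped and the Index tuple is written as the leading [1, 0]. The small sketch below reproduces this with a hand-made namedtuple; the Row type and its fields are illustrative stand-ins, not the actual rows from the spreadsheet.

```python
import json
from collections import namedtuple

# Illustrative stand-in for one row produced by itertuples()
Row = namedtuple("Row", ["Index", "firstname", "lastname"])
t = Row(Index=(1, 0), firstname="John", lastname="Doe")

# A namedtuple is still a tuple, so it is serialized as a JSON array
as_array = json.dumps(t)

# Converting to a dict first keeps the field names; Index is dropped here
as_object = json.dumps({k: v for k, v in t._asdict().items() if k != "Index"})

print(as_array)
print(as_object)
```

Storing t._asdict() (minus Index) instead of t in the loop above would therefore keep the headings in the exported file.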
