I'm looking for a Python technique to build a nested JSON file from a flat table in a pandas DataFrame. For example, how could a pandas DataFrame such as:
With some input from @root I used a different tack and came up with the following code, which seems to get most of the way there:
import pandas
import json
from collections import defaultdict
inputExcel = 'E:\\teamsMM.xlsx'
exportJson = 'E:\\teamsMM.json'
# Recent pandas versions spell the keyword sheet_name and take no encoding argument
data = pandas.read_excel(inputExcel, sheet_name='SCAT Teams')
grouped = data.groupby(['teamname', 'members']).first()
results = defaultdict(lambda: defaultdict(dict))
for t in grouped.itertuples():
    for i, key in enumerate(t.Index):
        if i == 0:
            nested = results[key]
        elif i == len(t.Index) - 1:
            nested[key] = t
        else:
            nested = nested[key]
formattedJson = json.dumps(results, indent = 4)
formattedJson = '{\n"teams": [\n' + formattedJson +'\n]\n }'
# Use a context manager so the file is flushed and closed after writing
with open(exportJson, "w") as parsed:
    parsed.write(formattedJson)
The resulting JSON file is this:
{
    "teams": [
        {
            "1": {
                "0": [
                    [
                        1,
                        0
                    ],
                    "John",
                    "Doe",
                    "Anon",
                    "916-555-1234",
                    "none",
                    "john.doe@wildlife.net"
                ],
                "1": [
                    [
                        1,
                        1
                    ],
                    "Jane",
                    "Doe",
                    "Anon",
                    "916-555-4321",
                    "916-555-7890",
                    "jane.doe@wildlife.net"
                ]
            },
            "2": {
                "0": [
                    [
                        2,
                        0
                    ],
                    "Mickey",
                    "Moose",
                    "Moosers",
                    "916-555-0000",
                    "916-555-1111",
                    "mickey.moose@wildlife.net"
                ],
                "1": [
                    [
                        2,
                        1
                    ],
                    "Minny",
                    "Moose",
                    "Moosers",
                    "916-555-2222",
                    "none",
                    "minny.moose@wildlife.net"
                ]
            }
        }
    ]
}
This format is very close to the desired end product. Two issues remain: removing the redundant array [1, 0] that appears just above each firstname, and getting each nest to use the headers "teamname": "1" and "members": instead of "1": and "0":.
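One direction that might address both remaining issues is to group on teamname only and convert each group's rows to records, which keeps the column names as keys. This is only a minimal sketch: the sample frame is made up to mirror the output above, and every column name other than teamname, members and firstname is an assumption, as are the variable names teams and formatted_json.

```python
import json
import pandas as pd

# Hypothetical sample data; only teamname, members and firstname are
# column names taken from the post, lastname is assumed.
data = pd.DataFrame({
    "teamname": [1, 1, 2, 2],
    "members": [0, 1, 0, 1],
    "firstname": ["John", "Jane", "Mickey", "Minny"],
    "lastname": ["Doe", "Doe", "Moose", "Moose"],
})

teams = []
for teamname, group in data.groupby("teamname"):
    # Round-trip through to_json so values come back as plain Python types
    # and each row becomes a dict keyed by column name.
    records = json.loads(group.drop(columns="teamname").to_json(orient="records"))
    teams.append({"teamname": str(teamname), "members": records})

formatted_json = json.dumps({"teams": teams}, indent=4)
print(formatted_json)
```

Because the outer "teams" wrapper is built as a real dictionary before calling json.dumps, no string concatenation is needed around the result.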
Also, I do not know why each record is stripped of its headings during the conversion. For instance, why is the dictionary entry "firstname": "John" exported as just "John"?
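A likely explanation is that itertuples() yields namedtuples, and json.dumps serializes any tuple as a JSON array, so the field names never reach the output. The Index field holds the (teamname, members) pair, which would account for the leading [1, 0]. The snippet below is a small stand-alone illustration using a hand-built namedtuple; the Row type and its fields are stand-ins, not the actual rows from the spreadsheet.

```python
import json
from collections import namedtuple

# Stand-in for one row from itertuples(); the field names are illustrative.
Row = namedtuple("Row", ["Index", "firstname", "lastname"])
row = Row(Index=(1, 0), firstname="John", lastname="Doe")

as_list = json.dumps(row)            # tuple -> JSON array, field names dropped
as_dict = json.dumps(row._asdict())  # dict -> JSON object, field names kept
print(as_list)
print(as_dict)
```

If that is the cause, storing t._asdict() instead of t in the loop should keep the headings, although the Index entry would still need to be removed separately.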