Question
I am trying to normalize a JSON file that looks like this (a small snippet):
[{'trimestre': 'A2000',
  'cours': [{"sigle": "TECH 20701",
             "titre": "La cybersécurité et le gestionnaire",
             'etudiants': [{'matricule': '22000803',
                            'nom': 'Boyer,André',
                            'note': 'C+',
                            'valeur': 2.3},
                           {'matricule': '22000829',
                            'nom': 'Keighan,Maylis',
                            'note': 'A+',
                            'valeur': 4.3},
                           {'matricule': '22000869',
                            'nom': 'Lahaie,Lyes',
                            'note': 'B+',
                            'valeur': 3.3},
                           {'matricule': '22000973',
                            'nom': 'Conerardy,Rawaa',
                            'note': 'B+',
                            'valeur': 3.3},
                           ]}]
I'm trying to get a table that looks like this:
**"trimestre"** (columns)
**"sigle" + "titre"** (index): *valeur*
import pandas as pd
import json
import numpy as np
from pandas.io.json import json_normalize
data = pd.read_json('DataTP2.json')
print(data)
I tried using the normalize function like this:
result = json_normalize(data, 'cours',['trimestre'])
print(result)
But I am getting an error: TypeError: string indices must be integers
Basically I want "sigle" + "titre" (from "cours") as an index, "trimestre" as a column and the mean value of "valeur" as values in the table.
Thanks in advance!
Answer 1:
Here you are:
from collections import defaultdict
import json
import pandas as pd

with open("data.json", "r") as f:
    data = json.load(f)
test = [{'trimestre': 'A2000',
         'cours': [{"sigle": "TECH 20701",
                    "titre": "La cybersécurité et le gestionnaire",
                    'etudiants': [{'matricule': '22000803',
                                   'nom': 'Boyer,André',
                                   'note': 'C+',
                                   'valeur': 2.3},
                                  {'matricule': '22000829',
                                   'nom': 'Keighan,Maylis',
                                   'note': 'A+',
                                   'valeur': 4.3},
                                  {'matricule': '22000869',
                                   'nom': 'Lahaie,Lyes',
                                   'note': 'B+',
                                   'valeur': 3.3},
                                  {'matricule': '22000973',
                                   'nom': 'Conerardy,Rawaa',
                                   'note': 'B+',
                                   'valeur': 3.3},
                                  ]}]}]
results = defaultdict(list)
for trimestre in data:
    results["trimestre"].append(trimestre["trimestre"])
    for cours in trimestre["cours"]:
        results["index"].append(f"{cours['sigle']} {cours['titre']}")
        # Mean of "valeur" over the students enrolled in this course
        valeurs = [etudiant["valeur"] for etudiant in cours["etudiants"]]
        results["valeur"].append(sum(valeurs) / len(valeurs))
# Note: this layout assumes a single trimestre, as in the sample data
df = pd.DataFrame(results["valeur"], columns=results["trimestre"], index=results["index"])
Results
>>> print(df)
                                                A2000
TECH 20701 La cybersécurité et le gestionnaire    3.3
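A possible alternative that stays closer to the json_normalize attempt in the question. The TypeError there most likely comes from passing the DataFrame returned by pd.read_json to json_normalize, which expects the parsed list of dicts instead. The sketch below assumes pandas 1.0 or later (where the function is available as pd.json_normalize) and uses the sample data from the question inline; the names flat and table are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (one trimestre, one course).
data = [{
    "trimestre": "A2000",
    "cours": [{
        "sigle": "TECH 20701",
        "titre": "La cybersécurité et le gestionnaire",
        "etudiants": [
            {"matricule": "22000803", "nom": "Boyer,André", "note": "C+", "valeur": 2.3},
            {"matricule": "22000829", "nom": "Keighan,Maylis", "note": "A+", "valeur": 4.3},
            {"matricule": "22000869", "nom": "Lahaie,Lyes", "note": "B+", "valeur": 3.3},
            {"matricule": "22000973", "nom": "Conerardy,Rawaa", "note": "B+", "valeur": 3.3},
        ],
    }],
}]

# One row per student. The meta paths copy the enclosing trimestre
# and course fields onto every student row.
flat = pd.json_normalize(
    data,
    record_path=["cours", "etudiants"],
    meta=["trimestre", ["cours", "sigle"], ["cours", "titre"]],
)

# Build the "sigle + titre" label used as the row index.
flat["cours"] = flat["cours.sigle"] + " " + flat["cours.titre"]

# Rows: course label, columns: trimestre, cells: mean of "valeur".
table = flat.pivot_table(
    index="cours", columns="trimestre", values="valeur", aggfunc="mean"
)
print(table)
```

With the real file, the inline data would presumably be replaced by json.load on DataTP2.json. Unlike building the DataFrame by hand, pivot_table should also cope with several trimestres, leaving NaN where a course was not given in a trimestre.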
Source: https://stackoverflow.com/questions/58772980/normalize-a-deeply-nested-json-in-pandas