Question
I have a DataFrame in which one column, "information" in my case, holds a different dict in each row (or NaN). I would like to expand the DataFrame so that every key that appears in any of the dicts becomes its own column. For example:
import pandas as pd
import numpy as np

df = pd.DataFrame({'id': pd.Series([1, 2, 3, 4, 5]),
                   'name': pd.Series(['banana',
                                      'apple',
                                      'orange',
                                      'strawberry',
                                      'toast']),
                   'information': pd.Series([{'shape': 'curve', 'color': 'yellow'},
                                             {'color': 'red'},
                                             {'shape': 'round'},
                                             {'amount': 500},
                                             np.nan]),
                   'cost': pd.Series([1, 2, 2, 10, 4])})
   id        name                            information  cost
0   1      banana  {'shape': 'curve', 'color': 'yellow'}     1
1   2       apple                       {'color': 'red'}     2
2   3      orange                     {'shape': 'round'}     2
3   4  strawberry                        {'amount': 500}    10
4   5       toast                                    NaN     4
The result should look like this:
   id        name  shape   color  amount  cost
0   1      banana  curve  yellow     NaN     1
1   2       apple    NaN     red     NaN     2
2   3      orange  round     NaN     NaN     2
3   4  strawberry    NaN     NaN   500.0    10
4   5       toast    NaN     NaN     NaN     4
Answer 1:
Another approach is to use pandas.DataFrame.from_records:
import pandas as pd
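# Replace missing entries (NaN) with empty dicts, then build one column per key.
new = pd.DataFrame.from_records(df.pop('information').apply(lambda x: {} if pd.isna(x) else x))
# Pass axis by keyword; recent pandas versions no longer accept it positionally.
new = pd.concat([df, new], axis=1)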
print(new)
Output (the column order may differ between pandas versions):
   cost  id        name  amount   color  shape
0     1   1      banana     NaN  yellow  curve
1     2   2       apple     NaN     red    NaN
2     2   3      orange     NaN     NaN  round
3    10   4  strawberry   500.0     NaN    NaN
4     4   5       toast     NaN     NaN    NaN
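One caveat with the from_records approach: as far as I can tell, the new frame gets a fresh 0..n-1 index, so pd.concat only lines the rows up when df also has the default index. Below is a minimal sketch of an index-safe variant. The helper name expand_dict_column and the letter index are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd


def expand_dict_column(frame, column):
    """Hypothetical helper: replace a column of dicts with one column per key."""
    # Treat anything that is not a dict (NaN, None, ...) as an empty record.
    records = [v if isinstance(v, dict) else {} for v in frame[column]]
    # Reuse the original index so that join aligns rows even if it is not 0..n-1.
    expanded = pd.DataFrame(records, index=frame.index)
    return frame.drop(columns=column).join(expanded)


# Same data as in the question, but with a non-default index (an assumption
# made here only to show the alignment).
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'name': ['banana', 'apple', 'orange', 'strawberry', 'toast'],
    'information': [{'shape': 'curve', 'color': 'yellow'},
                    {'color': 'red'},
                    {'shape': 'round'},
                    {'amount': 500},
                    np.nan],
    'cost': [1, 2, 2, 10, 4],
}, index=list('abcde'))

result = expand_dict_column(df, 'information')
print(result)
```

Unlike df.pop, this leaves the original df untouched and returns a new frame.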
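Answer 2:
You can build a dict keyed by the row index, replacing NaN with an empty dict (v != v is true only when v is NaN), and pass it to DataFrame.from_dict: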
d = {k: {} if v != v else v for k, v in df.pop('information').items()}
df1 = pd.DataFrame.from_dict(d, orient='index')
df = pd.concat([df, df1], axis=1)
print(df)
   id        name  cost  shape   color  amount
0   1      banana     1  curve  yellow     NaN
1   2       apple     2    NaN     red     NaN
2   3      orange     2  round     NaN     NaN
3   4  strawberry    10    NaN     NaN   500.0
4   5       toast     4    NaN     NaN     NaN
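The v != v test in the dict comprehension works because NaN is not equal to itself. A small sketch of how it compares with pd.isna, which may matter if the column can also contain None (the sample values below are made up for illustration):

```python
import numpy as np
import pandas as pd

# Illustrative values: a dict, two spellings of NaN, and None.
values = [{'color': 'red'}, np.nan, float('nan'), None]

# NaN compares unequal to itself; dicts and None compare equal to themselves.
self_unequal = [v != v for v in values]

# pd.isna also reports None as missing, and returns False for a dict.
isna = [bool(pd.isna(v)) for v in values]

print(self_unequal)  # [False, True, True, False]
print(isna)          # [False, True, True, True]
```

So the v != v check catches NaN but lets None through, whereas pd.isna catches both.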
Source: https://stackoverflow.com/questions/57686812/how-to-expand-a-df-by-different-dict-as-columns