I want to flatten JSON column in a Pandas DataFrame

Posted by 谁说胖子不能爱 on 2019-11-28 09:20:44

Question


I have an input dataframe df which is as follows:

id  e
1   {"k1":"v1","k2":"v2"}
2   {"k1":"v3","k2":"v4"}
3   {"k1":"v5","k2":"v6"}

I want to "flatten" the column 'e' so that my resultant dataframe is:

id  e.k1  e.k2
1   v1    v2
2   v3    v4
3   v5    v6

How can I do this? I tried using json_normalize but did not have much success.


Answer 1:


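Here is a way to do it with json_normalize(). Since pandas 1.0 it is exposed at the top level as pandas.json_normalize; importing it from pandas.io.json is deprecated:

from pandas import json_normalize

# Expand each dict in "e" into its own columns, prefix them, and join back on the index
df = df.join(json_normalize(df["e"].tolist()).add_prefix("e.")).drop(["e"], axis=1)
print(df)
#   id e.k1 e.k2
#0   1   v1   v2
#1   2   v3   v4
#2   3   v5   v6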
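One thing worth checking with this approach: join() aligns rows on the index, while json_normalize() returns a frame with a fresh 0..n-1 index. The following is a minimal sketch, assuming pandas 1.0 or later and a column that already holds dicts; the non-default index (10, 20, 30) is made up here purely to show the alignment step.

```python
import pandas as pd

# Sample data mirroring the question. The custom index is an assumption
# added for illustration; the question's frame uses the default index.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "e": [
            {"k1": "v1", "k2": "v2"},
            {"k1": "v3", "k2": "v4"},
            {"k1": "v5", "k2": "v6"},
        ],
    },
    index=[10, 20, 30],
)

# json_normalize builds a new frame indexed 0..n-1, so copy df's index
# onto it before joining; otherwise the labels would not match up.
flat = pd.json_normalize(df["e"].tolist()).add_prefix("e.")
flat.index = df.index

result = df.drop(columns="e").join(flat)
print(result)
```

With the default RangeIndex, as in the question, the reassignment is a no-op, so it is harmless to keep.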

However, if your column actually holds str values rather than dicts, you first have to parse it by mapping json.loads() over it:

import json
df = df.join(json_normalize(df['e'].map(json.loads).tolist()).add_prefix('e.'))\
    .drop(['e'], axis=1)
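If it is unclear whether the column holds strings or dicts, both cases can be handled in one place. The sketch below is only an illustration: flatten_json_column is a hypothetical helper name, not part of pandas, and it assumes pandas 1.0 or later and that every cell is either a dict or a valid JSON object string.

```python
import json

import pandas as pd


def flatten_json_column(df, column):
    """Hypothetical helper: expand a column of dicts or JSON strings."""
    # Parse only the cells that are still strings; dicts pass through unchanged.
    parsed = df[column].map(lambda v: json.loads(v) if isinstance(v, str) else v)
    flat = pd.json_normalize(parsed.tolist()).add_prefix(f"{column}.")
    flat.index = df.index  # keep rows aligned with the original frame
    return df.drop(columns=column).join(flat)


# Sample data from the question, stored as JSON strings.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "e": [
            '{"k1":"v1","k2":"v2"}',
            '{"k1":"v3","k2":"v4"}',
            '{"k1":"v5","k2":"v6"}',
        ],
    }
)

result = flatten_json_column(df, "e")
print(result)
```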



Answer 2:


If your column holds JSON strings rather than dictionaries, you could parse it with map(json.loads) (this needs import json) and then apply pd.Series:

s = df['e'].map(json.loads).apply(pd.Series).add_prefix('e.')

Or if it is already a dictionary, you can apply pd.Series directly:

s = df['e'].apply(pd.Series).add_prefix('e.')

Finally use pd.concat to join back the other columns:

>>> pd.concat([df.drop(['e'], axis=1), s], axis=1).set_index('id')
   e.k1 e.k2
id
1    v1   v2
2    v3   v4
3    v5   v6
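Put together, the steps in this answer could look like the sketch below. It assumes the column holds JSON strings and rebuilds the sample frame from the question so that it runs on its own.

```python
import json

import pandas as pd

# Sample data from the question, stored as JSON strings.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "e": [
            '{"k1":"v1","k2":"v2"}',
            '{"k1":"v3","k2":"v4"}',
            '{"k1":"v5","k2":"v6"}',
        ],
    }
)

# Parse each JSON string into a dict, then expand every dict into columns.
s = df["e"].map(json.loads).apply(pd.Series).add_prefix("e.")

# Glue the expanded columns back next to the remaining ones.
result = pd.concat([df.drop(columns="e"), s], axis=1).set_index("id")
print(result)
```

Note that apply(pd.Series) builds one Series per row, so it can be noticeably slower than json_normalize on large frames.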


Source: https://stackoverflow.com/questions/49822874/i-want-to-flatten-json-column-in-a-pandas-dataframe
