Python Pandas merge multiple columns into a dictionary column

女生的网名这么多〃 submitted on 2021-02-07 20:17:51

Question


I have a dataframe (df_full) like so:

|cust_id|address    |store_id|email        |sales_channel|category|
-------------------------------------------------------------------
|1234567|123 Main St|10SjtT  |idk@gmail.com|ecom         |direct  |
|4567345|345 Main St|10SjtT  |101@gmail.com|instore      |direct  |
|1569457|876 Main St|51FstT  |404@gmail.com|ecom         |direct  |

I would like to combine the last four columns into a single metadata column whose values are dictionaries, like this:

|cust_id|address    |metadata                                                                                     |
-------------------------------------------------------------------------------------------------------------------
|1234567|123 Main St|{'store_id':'10SjtT', 'email':'idk@gmail.com','sales_channel':'ecom', 'category':'direct'}   |
|4567345|345 Main St|{'store_id':'10SjtT', 'email':'101@gmail.com','sales_channel':'instore', 'category':'direct'}|
|1569457|876 Main St|{'store_id':'51FstT', 'email':'404@gmail.com','sales_channel':'ecom', 'category':'direct'}   |

Is that possible? I've seen a few solutions on Stack Overflow, but none of them address combining more than two columns into a dictionary column.


Answer 1:


Use to_dict with orient='records', which returns one dictionary per row (df here is the question's df_full):

columns = ['store_id', 'email', 'sales_channel', 'category']
df['metadata'] = df[columns].to_dict(orient='records')

If you also want to drop the original columns:

df = df.drop(columns=columns)
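
For reference, here is a self-contained sketch of the two steps above, run against the sample rows from the question. The names meta_cols, records and result are illustrative and not part of the original answer; using assign to attach the list is a stylistic choice, assumed here to be equivalent to the direct column assignment shown above.

```python
import pandas as pd

# Sample data copied from the question.
df_full = pd.DataFrame({
    "cust_id": [1234567, 4567345, 1569457],
    "address": ["123 Main St", "345 Main St", "876 Main St"],
    "store_id": ["10SjtT", "10SjtT", "51FstT"],
    "email": ["idk@gmail.com", "101@gmail.com", "404@gmail.com"],
    "sales_channel": ["ecom", "instore", "ecom"],
    "category": ["direct", "direct", "direct"],
})

# Columns to fold into the dictionary column (illustrative name).
meta_cols = ["store_id", "email", "sales_channel", "category"]

# orient="records" yields a list with one {column: value} dict per row,
# in row order.
records = df_full[meta_cols].to_dict(orient="records")

# A plain list is attached by position, so it lines up with the rows
# whatever the index labels are.
result = df_full.drop(columns=meta_cols).assign(metadata=records)

print(result)
```

Because the dictionaries are ordinary Python objects, the metadata column ends up with object dtype.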



Answer 2:


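set_index

Move the columns you want to keep into the index, convert each remaining row to a dict, then turn the index back into columns: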

df.set_index(['cust_id', 'address']).apply(dict, axis=1).reset_index(name='metadata')

   cust_id      address                                           metadata
0  1234567  123 Main St  {'store_id': '10SjtT', 'email': 'idk@gmail.com...
1  4567345  345 Main St  {'store_id': '10SjtT', 'email': '101@gmail.com...
2  1569457  876 Main St  {'store_id': '51FstT', 'email': '404@gmail.com...
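
The one-liner above can be unpacked step by step. This is a minimal sketch on the question's sample rows; the variable names keep and result are illustrative. Note that apply with axis=1 calls dict once per row in Python, so it is probably better suited to small or medium frames.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cust_id": [1234567, 4567345, 1569457],
    "address": ["123 Main St", "345 Main St", "876 Main St"],
    "store_id": ["10SjtT", "10SjtT", "51FstT"],
    "email": ["idk@gmail.com", "101@gmail.com", "404@gmail.com"],
    "sales_channel": ["ecom", "instore", "ecom"],
    "category": ["direct", "direct", "direct"],
})

# Columns that should remain ordinary columns (illustrative name).
keep = ["cust_id", "address"]

# Step 1: the kept columns become index levels, leaving only the
# columns that should be packed.
indexed = df.set_index(keep)

# Step 2: dict(row) maps each remaining column label to its value,
# producing a Series of dicts.
packed = indexed.apply(dict, axis=1)

# Step 3: the index levels return as columns; the Series values are
# named "metadata".
result = packed.reset_index(name="metadata")

print(result)
```

With this approach you name the columns to keep rather than the columns to pack, which can be convenient when the metadata columns are numerous.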

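comprehension

Zip the columns into row tuples, keep the first two values of each row, and pair the remaining values with their column names: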

dat = [(c, a, dict(zip([*df][2:], m))) for c, a, *m in zip(*map(df.get, df))]
pd.DataFrame(dat, df.index, [*df][:2] + ['metadata'])

   cust_id      address                                           metadata
0  1234567  123 Main St  {'store_id': '10SjtT', 'email': 'idk@gmail.com...
1  4567345  345 Main St  {'store_id': '10SjtT', 'email': '101@gmail.com...
2  1569457  876 Main St  {'store_id': '51FstT', 'email': '404@gmail.com...
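
The comprehension above is compact but dense. A more verbose sketch of the same idea follows; pack_columns is a hypothetical helper written for illustration, not a pandas function, and it uses itertuples in place of the zip/map construction from the answer.

```python
import pandas as pd


def pack_columns(df, keep, name="metadata"):
    """Hypothetical helper: keep the `keep` columns and fold every
    other column into one dict per row, stored under `name`."""
    packed = [c for c in df.columns if c not in keep]
    # itertuples(index=False, name=None) yields one plain tuple per row.
    dicts = [
        dict(zip(packed, row))
        for row in df[packed].itertuples(index=False, name=None)
    ]
    out = df[keep].copy()
    out[name] = dicts  # list assignment is positional
    return out


# Sample data copied from the question.
df = pd.DataFrame({
    "cust_id": [1234567, 4567345, 1569457],
    "address": ["123 Main St", "345 Main St", "876 Main St"],
    "store_id": ["10SjtT", "10SjtT", "51FstT"],
    "email": ["idk@gmail.com", "101@gmail.com", "404@gmail.com"],
    "sales_channel": ["ecom", "instore", "ecom"],
    "category": ["direct", "direct", "direct"],
})

result = pack_columns(df, keep=["cust_id", "address"])
print(result)
```

Unlike the original comprehension, this version selects the kept columns by name rather than by position, so it does not assume they are the first two columns.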


Source: https://stackoverflow.com/questions/59741934/python-pandas-merge-multiple-columns-into-a-dictionary-column
