Construct sequences from a dataframe using dictionaries in Python

Submitted by 冷暖自知 on 2019-12-25 08:05:18

Question


I would like to construct sequences of each user's purchase history using dictionaries in Python, with each sequence ordered by date.

I have 3 columns in my dataframe:

users        items         date

1             1            date_1 
1             2            date_2
2             1            date_3
2             3            date_1
4             5            date_2
4             1            date_5
4             3            date_3

And the result should look like this:

{1: [[1,date_1],[2,date_2]], 2:[[3,date_1],[1,date_3]], 4:[[5,date_2],[3,date_3],[1,date_5]]}

My code is:

df_sub = df[['uid', 'nid', 'date']] 
dic3 = df_sub.set_index('uid').T.to_dict('list')

And my results are:

{36864: [258509L, '2014-12-03'], 548873: [502105L, '2015-09-08'], 42327: [492268L, '2015-01-29'], 548873: [370049L, '2015-02-18'], 36864: [258909L, '2016-01-13'] ... }

But I would like the entries grouped by user:

{36864: [[258509L, '2014-12-03'],[258909L, '2016-01-13']], 548873: [[502105L, '2015-09-08'],[370049L, '2015-02-18']], 42327: [492268L, '2015-01-29']}

Some help, please!


Answer 1:


First, set users as the index and group by it. Then pass each group to a function that sorts it by its date column and extracts the underlying array with .values.

Use .tolist to convert that array into a list of [item, date] pairs, which is the required format. Finally, use .to_dict to turn the grouped result into a dictionary keyed by user.

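# Sort one user's rows by date and return them as a list of [item, date] pairs
fnc = lambda x: x.sort_values('date').values.tolist()
# Group by user (the index), apply fnc to each group, and collect the result in a dict
df.set_index('users').groupby(level=0).apply(fnc).to_dict()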

produces:

{1: [[1, 'date_1'], [2, 'date_2']],
 2: [[3, 'date_1'], [1, 'date_3']],
 4: [[5, 'date_2'], [3, 'date_3'], [1, 'date_5']]}
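For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample table from the question and uses a plain dict comprehension over the groups instead of apply, so it is an alternative to the one-liner above rather than the same method. The variable name sequences is made up for illustration, and the placeholder strings date_1 ... date_5 are assumed to sort in the intended order; with real data, ISO-formatted strings such as '2014-12-03' (or a datetime column) would be needed for the sort to be chronological.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    'users': [1, 1, 2, 2, 4, 4, 4],
    'items': [1, 2, 1, 3, 5, 1, 3],
    'date':  ['date_1', 'date_2', 'date_3', 'date_1',
              'date_2', 'date_5', 'date_3'],
})

# Sort once by user and date; groupby keeps the row order inside each group
ordered = df.sort_values(['users', 'date'])

# Build {user: [[item, date], ...]} with one entry per user
sequences = {
    int(user): group[['items', 'date']].values.tolist()
    for user, group in ordered.groupby('users')
}

print(sequences)
```

On the sample table this should give the same dictionary as the output shown above. With the question's real column names, 'users' and 'items' would become 'uid' and 'nid'.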


Source: https://stackoverflow.com/questions/41330030/construct-sequences-from-a-dataframe-using-dictionaries-in-python
