Series of if statements applied to data frame

♀尐吖头ヾ 提交于 2019-12-11 13:26:22

问题


I have a question on how to this task. I want to return or group a series of numbers in my data frame, the numbers are from the column 'PD' which ranges from .001 to 1. What I want to do is to group those that are .91>'PD'>.9 to .91 (or return a value of .91), .92>'PD'>=.91 to .92, ..., 1>='PD' >=.99 to 1. onto a column named 'Grouping'. What I have been doing is manually doing each if statement then merging it with the base data frame. Can anyone please help me with a more efficient way of doing this? Still on the early stages of using python. Sorry if the question seems to be easy. Thank you for answering and for your time.


回答1:


Let your data look like this

>>> df = pd.DataFrame({'PD': np.arange(0.001, 1, 0.001), 'data': np.random.randint(10, size=999)})
>>> df.head()
      PD  data
0  0.001     6
1  0.002     3
2  0.003     5
3  0.004     9
4  0.005     7

Then cut-off the last decimal of the PD column. This is a bit tricky since you get a lot of issues with rounding when doing it without str conversion. E.g.

>>> df['PD'] = df['PD'].apply(lambda x: float('{:.3f}'.format(x)[:-1]))
>>> df.tail()
       PD  data
994  0.99     1
995  0.99     3
996  0.99     2
997  0.99     1
998  0.99     0

Now you can use the pandas-groupby. Do with data whatever you want, e.g.

>>> df.groupby('PD').agg(lambda x: ','.join(map(str, x)))
                     data
PD                       
0.00    6,3,5,9,7,3,6,8,4
0.01  3,5,7,0,4,9,7,1,7,1
0.02  0,0,9,1,5,4,1,6,7,3
0.03  4,4,6,4,6,5,4,4,2,1
0.04  8,3,1,4,6,5,0,6,0,5
[...]

Note that the first row is one item shorter due to missing 0.000 in my sample.



来源:https://stackoverflow.com/questions/51316631/series-of-if-statements-applied-to-data-frame

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!