Numpy Where with more than 2 conditions

可紊 submitted on 2021-02-05 05:52:07

Question


Good Morning,

I have a dataframe with two columns of integers and a Series (diff) computed as:

diff = (df["col_1"] - df["col_2"]) / (df["col_2"])

I would like to create a column of the dataframe whose values are:

  • equal to 0, if (diff >= 0) & (diff <= 0.35)
  • equal to 1, if (diff > 0.35)
  • equal to 2, if (diff < 0) & (diff >= -0.35)
  • equal to 3, if (diff < -0.35)

I tried with:

df["Class"] = np.where( (diff >= 0) &  (diff <= 0.35), 0, 
np.where( (diff > 0.35), 1, 
np.where( (diff  < 0) & (diff >=  - 0.35) ), 2, 
np.where( ((diff <  - 0.35), 3) ))) 

But it reports the following error:

SystemError: <built-in function where> returned a result with an error set          

How can I fix it?
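For reference, the parentheses in the nested call above are misplaced: the third np.where is closed right after its condition, and the fourth receives a tuple instead of three arguments. A minimal sketch of the same nesting with balanced parentheses follows; the sample data is made up for illustration, and the last class is used as the final "else" branch, which means NaN values of diff would also end up as 3.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen so that every class appears once.
df = pd.DataFrame({"col_1": [110, 150, 90, 50], "col_2": [100, 100, 100, 100]})
diff = (df["col_1"] - df["col_2"]) / df["col_2"]  # 0.1, 0.5, -0.1, -0.5

# Each np.where takes three arguments: condition, value if true, value if false.
df["Class"] = np.where(
    (diff >= 0) & (diff <= 0.35), 0,
    np.where(
        diff > 0.35, 1,
        np.where((diff < 0) & (diff >= -0.35), 2, 3),
    ),
)
```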


Answer 1:


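You can use numpy.select to specify the conditions and values separately. The first matching condition wins, so a value of exactly 0 is assigned 0 even though between(-0.35, 0) also includes it, and rows that match no condition get the default (np.nan here).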

import numpy as np

s = (df['col_1'] / df['col_2']) - 1

conditions = [s.between(0, 0.35), s > 0.35, s.between(-0.35, 0), s < -0.35]
values = [0, 1, 2, 3]

df['Class'] = np.select(conditions, values, np.nan)
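A self-contained sketch of the np.select approach, run on made-up data, may help show the tie-breaking and the default. The column values are assumptions chosen for illustration; the last row divides zero by zero on purpose to produce a NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row is 0 / 0, which pandas turns into NaN.
df = pd.DataFrame({
    "col_1": [100, 120, 150, 90, 50, 0],
    "col_2": [100, 100, 100, 100, 100, 0],
})
s = (df["col_1"] / df["col_2"]) - 1

# Series.between is inclusive on both ends, so 0 satisfies the first and the
# third condition; np.select uses the first condition that is True.
conditions = [s.between(0, 0.35), s > 0.35, s.between(-0.35, 0), s < -0.35]
values = [0, 1, 2, 3]

# Rows where no condition holds (here the NaN row) receive the default.
df["Class"] = np.select(conditions, values, np.nan)
```

Because the default is np.nan, the resulting column is floating point rather than integer.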



Answer 2:


One can also simply use numpy.searchsorted:

diff_classes = [-0.35,0,0.35]
def getClass(x):
    return len(diff_classes)-np.searchsorted(diff_classes,x)

df["class"]=diff.apply(getClass)

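searchsorted returns the position at which x would be inserted into the sorted diff_classes list, and that position is then subtracted from 3. Note that this numbers the bins 3, 2, 1, 0 from the lowest diff to the highest, so the two positive bins come out swapped relative to the question: values in (0, 0.35] get 1 and values above 0.35 get 0. With the default side='left', the boundary values -0.35 and 0 also fall into the lower bin (3 and 2 respectively).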

Edit: a little less readable, but it also works in one line:

df["class"] = diff.apply(lambda x: 3-np.searchsorted([-0.35,0,0.35],x))
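The searchsorted idea can also be applied to the whole Series in one call, with a lookup array translating bin positions into the question's labels. This is only a sketch under assumptions: the sample values and the names edges, labels, and bins are illustrative, and nudging the upper edge with np.nextafter is one possible way to keep a diff of exactly 0.35 in class 0. NaN handling is left out.

```python
import numpy as np
import pandas as pd

# Hypothetical diff values, including every boundary from the question.
diff = pd.Series([-0.5, -0.35, -0.1, 0.0, 0.2, 0.35, 0.5])

# Bin edges. The last edge is the next float above 0.35, so that with
# side="right" a value of exactly 0.35 still lands in the [0, 0.35] bin.
edges = np.array([-0.35, 0.0, np.nextafter(0.35, np.inf)])

# labels[i] is the class of bin i, counting bins from lowest diff to highest.
labels = np.array([3, 2, 0, 1])

# side="right" puts a value equal to an edge into the bin above that edge.
bins = np.searchsorted(edges, diff.to_numpy(), side="right")
classes = pd.Series(labels[bins], index=diff.index)
```

The result can then be assigned to a column, for example df["Class"] = classes, provided diff shares the dataframe's index.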


Source: https://stackoverflow.com/questions/51301149/numpy-where-with-more-than-2-conditions
