Pandas DataFrame - assign 1,0 values based on other column

Submitted by 牧云@^-^@ on 2019-12-31 02:34:06

Question


I've got a dataframe containing country names & their percentage of energy output. I need to add a new column that assigns a 1 or 0, based on whether the country's energy output is above or below the median of energy output. Some dummy code is:

import pandas as pd
def answer():
    df = pd.DataFrame({'name':['china', 'america', 'canada'], 'output': [33.2, 15.0, 5.0]})
    df['newcol'] = df.where(df['output'] > df['output'].median(), 1, 0)
    return df['newcol']
answer()

The code raises:

ValueError: Wrong number of items passed 2, placement implies 1

I feel like this is an incredibly simple fix, but I'm new to working with pandas. Please help end my frustration.


Answer 1:


@Vaishali explains why pd.DataFrame.where didn't work as you expected and suggests using np.where instead, which is very good advice.

I'll offer up that you could have simply converted your boolean result to integers.

Setup

import numpy as np   # needed for Option 2
import pandas as pd

df = pd.DataFrame({
    'name': ['china', 'america', 'canada'],
    'output': [33.2, 15.0, 5.0]
})

Option 1

df['newcol'] = (df['output'] > df['output'].median()).astype(int)

Option 2
Or, faster still, work on the underlying NumPy array:

o = df['output'].values
df['newcol'] = (o > np.median(o)).astype(int)
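
One caveat worth checking before switching to Option 2: the two options agree on the sample data, but they can diverge when the column contains missing values, because Series.median skips NaN by default while np.median returns NaN. A minimal sketch of that difference follows; the `gappy` Series is hypothetical data made up for illustration and is not part of the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['china', 'america', 'canada'],
    'output': [33.2, 15.0, 5.0]
})

# On the sample data both options produce the same flags.
via_pandas = (df['output'] > df['output'].median()).astype(int)
o = df['output'].to_numpy()
via_numpy = (o > np.median(o)).astype(int)
print(via_pandas.tolist(), via_numpy.tolist())

# Hypothetical column with a missing value (not from the question).
gappy = pd.Series([33.2, np.nan, 5.0])

# pandas ignores the NaN when computing the median.
gappy_pandas = (gappy > gappy.median()).astype(int)

# np.median returns NaN here, and every comparison against NaN is False.
g = gappy.to_numpy()
gappy_numpy = (g > np.median(g)).astype(int)
print(gappy_pandas.tolist(), gappy_numpy.tolist())
```

If the data may contain NaN, np.nanmedian is the NumPy function that skips missing values in the same way.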



Answer 2:


You don't need a loop, as the solution is vectorized.

df['newcol'] = np.where((df['output'] > df['output'].median()), 1, 0)

    name    output  newcol
0   china   33.2    1
1   america 15.0    0
2   canada  5.0     0

As for the error "Wrong number of items passed": df.where works a little differently from np.where. It returns an object of the same shape as self, whose entries are taken from self where cond is True and from other otherwise. In your case it therefore returns a DataFrame with two columns instead of a Series, and assigning that two-column DataFrame to a single column raises the error.
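
The shape difference can be seen directly. The following is a small sketch using the question's data; the variable names `cond`, `masked`, and `flags` are chosen here for illustration, and df.where is called with its default other (NaN) to keep the example independent of how mixed column types are handled.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['china', 'america', 'canada'],
    'output': [33.2, 15.0, 5.0]
})
cond = df['output'] > df['output'].median()

# DataFrame.where masks the whole frame: rows where cond is False
# are replaced (with NaN by default), and the shape stays (3, 2).
masked = df.where(cond)
print(type(masked).__name__, masked.shape)   # DataFrame (3, 2)

# np.where chooses between the two given values element by element
# and returns a one-dimensional array, which fits a single column.
flags = np.where(cond, 1, 0)
print(type(flags).__name__, flags.shape)     # ndarray (3,)

df['newcol'] = flags
```

Because `flags` has one value per row, the assignment to df['newcol'] succeeds, whereas `masked` carries both original columns and cannot be placed into one.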



Source: https://stackoverflow.com/questions/46230285/pandas-dataframe-assign-1-0-values-based-on-other-column
