Question
I want to make a calculation based on 4 columns in a dataframe and apply the result to a new column.
The 4 columns I'm interested in are as follows.
   rating_1  time_1  rating_2  time_2  col_x  col_y  etc
0         1       1         1       1      1      1
If time_1 is greater than time_2, I want rating_1 in the new column; if time_2 is greater, I want rating_2 in the new column.
What's the simplest way to do this please?
Answer 1:
You can use the numpy.where() function:
In [241]: x
Out[241]:
   rating_1  time_1  rating_2  time_2  col_x  col_y
0        11       1        21       1      1      1
1        12       2        21       1      1      1
2        13       1        21       5      1      1
3        14       5        21       5      1      1
In [242]: x['new'] = np.where(x.time_1 > x.time_2, x.rating_1, x.rating_2)
In [243]: x
Out[243]:
   rating_1  time_1  rating_2  time_2  col_x  col_y  new
0        11       1        21       1      1      1   21
1        12       2        21       1      1      1   12
2        13       1        21       5      1      1   21
3        14       5        21       5      1      1   21
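To try this outside an interactive session, here is a minimal self-contained sketch. The values are copied from the sample frame in this answer; building the frame from a dict is only an assumption about how such data might be set up. Note that with a strict > comparison, rows where the two times are equal fall through to rating_2.

```python
import numpy as np
import pandas as pd

# Sample values copied from the frame shown in the answer
# (constructing it from a dict is illustrative only).
x = pd.DataFrame({
    "rating_1": [11, 12, 13, 14],
    "time_1":   [1, 2, 1, 5],
    "rating_2": [21, 21, 21, 21],
    "time_2":   [1, 1, 5, 5],
    "col_x":    [1, 1, 1, 1],
    "col_y":    [1, 1, 1, 1],
})

# Element-wise choice: rating_1 where time_1 > time_2, otherwise rating_2.
# Ties (time_1 == time_2) take rating_2 because the comparison is strict.
x["new"] = np.where(x["time_1"] > x["time_2"], x["rating_1"], x["rating_2"])

print(x["new"].tolist())
```

Because np.where works on whole columns at once, it avoids a Python-level loop over the rows.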
Answer 2:
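def myfunc(row):
    # Pick the rating that belongs to the larger time (ties go to rating_1)
    if row.time_1 >= row.time_2:
        return row.rating_1
    else:
        return row.rating_2

# axis=1 calls myfunc once per row
df.loc[:, 'calculatedColumn'] = df.apply(myfunc, axis=1)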
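One thing to watch: this version compares with >=, so rows with equal times get rating_1, whereas a strict > would give them rating_2. The sketch below uses a hypothetical three-row frame (the values are made up for illustration, with a tie in the first row) to show the row-wise result next to a vectorized expression with the same tie handling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; the first row is a tie (time_1 == time_2).
df = pd.DataFrame({
    "rating_1": [11, 12, 13],
    "time_1":   [1, 2, 1],
    "rating_2": [21, 22, 23],
    "time_2":   [1, 1, 5],
})

def myfunc(row):
    # >= means a tie resolves to rating_1
    if row.time_1 >= row.time_2:
        return row.rating_1
    return row.rating_2

# Row-wise: myfunc is called once per row
df["calculatedColumn"] = df.apply(myfunc, axis=1)

# Vectorized expression with the same tie handling
vectorized = np.where(df["time_1"] >= df["time_2"], df["rating_1"], df["rating_2"])

print(df["calculatedColumn"].tolist())
print(vectorized.tolist())
```

Both lists should match; on large frames the vectorized form is usually much faster than apply with axis=1.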
Source: https://stackoverflow.com/questions/40222181/pandas-dataframe-create-new-column-based-on-simple-calcuation