I have a pandas DataFrame with 2 columns. I want to loop through its rows and, based on a string in column 2, add a string to a newly created third column.
I think you can use nested numpy.where calls, which is faster than a loop:
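# np.where needs both a "true" and a "false" value, so None fills rows that match neither condition
df['Column3'] = np.where(df['Column2']==variable1, variable2,
                np.where(df['Column2']==variable3, variable4, None))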
If you need a specific value for rows where both conditions are False, pass it as the last argument of the inner call:
df['Column3'] = np.where(df['Column2']==variable1, variable2,
                np.where(df['Column2']==variable3, variable4, variable5))
Sample:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Column2':[1,2,4,3]})
print (df)
   Column2
0        1
1        2
2        4
3        3
variable1 = 1
variable2 = 2
variable3 = 3
variable4 = 4
variable5 = 5
df['Column3'] = np.where(df['Column2']==variable1, variable2,
                np.where(df['Column2']==variable3, variable4, variable5))
print (df)
   Column2  Column3
0        1        2
1        2        5
2        4        5
3        3        4
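With more than two conditions the nested calls get hard to read. Below is a minimal sketch of the same idea using np.select, which is not part of the answer above but may be easier to extend. The column contents and the strings ('red', 'stop', 'unknown' and so on) are made up for illustration, chosen only because the question is about string values:

```python
import numpy as np
import pandas as pd

# Hypothetical data: Column2 holds strings, as described in the question.
df = pd.DataFrame({'Column1': [10, 20, 30, 40],
                   'Column2': ['red', 'green', 'blue', 'red']})

# Conditions are evaluated in order; the first one that is True wins.
conditions = [df['Column2'] == 'red', df['Column2'] == 'green']
choices = ['stop', 'go']

# default is used for rows where none of the conditions is True.
df['Column3'] = np.select(conditions, choices, default='unknown')
print(df)
```

Each extra rule is then one more entry in conditions and choices instead of one more level of nesting.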
Another solution, thanks to Jon Clements, uses map with a dictionary and fillna for the default value:
df['Column4'] = df.Column2.map({variable1: variable2, variable3:variable4}).fillna(variable5)
print (df)
   Column2  Column3  Column4
0        1        2      2.0
1        2        5      5.0
2        4        5      5.0
3        3        4      4.0
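Column4 comes out as float because map returns NaN for values that are not keys of the dictionary, and NaN forces a float column. If integers are wanted, a cast after fillna should restore them. A small sketch reusing the sample values (the names mapping, default and mapped are only for this illustration):

```python
import pandas as pd

df = pd.DataFrame({'Column2': [1, 2, 4, 3]})
mapping = {1: 2, 3: 4}   # variable1: variable2, variable3: variable4
default = 5              # variable5

# Rows whose value is not a key in mapping become NaN, so the result is float.
mapped = df['Column2'].map(mapping)

# Fill the gaps with the default, then cast back to integers.
df['Column4'] = mapped.fillna(default).astype(int)
print(df)
```

The cast is only safe after fillna, since NaN cannot be converted to an integer.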