Question
How can I add the output of a function call that returns multiple fields as new columns in a pandas DataFrame?
Sample code & data:
from pandas import DataFrame
People_List = [['Jon','Smith',21],['Mark','Brown',38],['Maria','Lee',42],['Jill','Jones',28],['Jack','Ford',55]]
df = DataFrame (People_List,columns=['First_Name','Last_Name','Age'])
print (df)
  First_Name Last_Name  Age
0        Jon     Smith   21
1       Mark     Brown   38
2      Maria       Lee   42
3       Jill     Jones   28
4       Jack      Ford   55
def getTitleBirthYear(df):
    if 'Maria' in df.First_Name:
        title='Ms'
    else:
        title='Mr'
    current_year = int('2020')
    birth_year=''
    age = df.Age
    birth_year = current_year - age
    return title,birth_year
getTitleBirthYear(df)
  title  birth_year
0    Mr        1999
1    Mr        1982
2    Ms        1978
3    Mr        1992
4    Mr        1965
final expected output:
  First_Name Last_Name  Age title  birth_year
0        Jon     Smith   21    Mr        1999
1       Mark     Brown   38    Mr        1982
2      Maria       Lee   42    Ms        1978
3       Jill     Jones   28    Mr        1992
4       Jack      Ford   55    Mr        1965
Please suggest. Thanks!
Answer 1:
Here are two ways: apply the function row-wise and assign the result to the new columns (assuming pandas is imported as pd).
df[['title', 'birth_year']] = pd.DataFrame(df.apply(getTitleBirthYear, axis=1).tolist())
df[['title', 'birth_year']] = df.apply(getTitleBirthYear, axis=1, result_type='expand')
  First_Name Last_Name  Age title  birth_year
0        Jon     Smith   21    Mr        1999
1       Mark     Brown   38    Mr        1982
2      Maria       Lee   42    Ms        1978
3       Jill     Jones   28    Mr        1992
4       Jack      Ford   55    Mr        1965
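For reference, here is a minimal self-contained sketch of the second form. The names `get_title_birth_year`, `row`, and `CURRENT_YEAR` are illustrative and not from the original post, and the sketch compares the first name exactly instead of using the substring test from the question.

```python
import pandas as pd

people = [['Jon', 'Smith', 21], ['Mark', 'Brown', 38], ['Maria', 'Lee', 42],
          ['Jill', 'Jones', 28], ['Jack', 'Ford', 55]]
df = pd.DataFrame(people, columns=['First_Name', 'Last_Name', 'Age'])

# Assumption: the year is fixed at 2020 to match the question's expected output.
CURRENT_YEAR = 2020


def get_title_birth_year(row):
    """Return (title, birth_year) for a single row of the frame."""
    title = 'Ms' if row['First_Name'] == 'Maria' else 'Mr'
    return title, CURRENT_YEAR - row['Age']


# axis=1 passes each row to the function as a Series;
# result_type='expand' turns the returned tuples into columns.
df[['title', 'birth_year']] = df.apply(
    get_title_birth_year, axis=1, result_type='expand'
)
print(df)
```

The function has to receive one row at a time. Called on the whole frame, as in the question, `'Maria' in df.First_Name` tests the index labels of the Series rather than its values, so the title would never come out as 'Ms'.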
Answer 2:
Although you can use apply, it is best to use vectorized functions (see "When should I (not) want to use pandas apply() in my code?"). Your logic can be simplified as below (assuming numpy and pandas are imported as np and pd):
print(df.assign(title=np.where(df["First_Name"].eq("Maria"), "Ms", "Mr"),
                birth_year=pd.Timestamp.now().year - df["Age"]))  # or 2020 - df["Age"]
  First_Name Last_Name  Age title  birth_year
0        Jon     Smith   21    Mr        1999
1       Mark     Brown   38    Mr        1982
2      Maria       Lee   42    Ms        1978
3       Jill     Jones   28    Mr        1992
4       Jack      Ford   55    Mr        1965
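A runnable version of the vectorized approach might look like the sketch below. It fixes the year at 2020 (an assumption, so the result matches the table above regardless of when it is run) and stores the result in a hypothetical variable `result`, since `assign` returns a new DataFrame and leaves `df` unchanged.

```python
import numpy as np
import pandas as pd

people = [['Jon', 'Smith', 21], ['Mark', 'Brown', 38], ['Maria', 'Lee', 42],
          ['Jill', 'Jones', 28], ['Jack', 'Ford', 55]]
df = pd.DataFrame(people, columns=['First_Name', 'Last_Name', 'Age'])

result = df.assign(
    # np.where picks 'Ms' where the condition holds and 'Mr' elsewhere.
    title=np.where(df['First_Name'].eq('Maria'), 'Ms', 'Mr'),
    # Whole-column arithmetic; 2020 is assumed here instead of the current year.
    birth_year=2020 - df['Age'],
)
print(result)
```

Both new columns are computed on whole columns at once, so no Python-level function is called per row.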
Source: https://stackoverflow.com/questions/65110840/how-to-add-insert-output-of-a-function-call-that-returns-multiple-fields-as-new