How to add/insert output of a function call that returns multiple fields, as new columns into Pandas dataframe?

Posted by 江枫思渺然 on 2020-12-13 03:03:25

Question


How can I add the output of a function call that returns multiple fields as new columns in a pandas DataFrame?

Sample code & data:

from pandas import DataFrame

People_List = [['Jon', 'Smith', 21], ['Mark', 'Brown', 38], ['Maria', 'Lee', 42],
               ['Jill', 'Jones', 28], ['Jack', 'Ford', 55]]
df = DataFrame(People_List, columns=['First_Name', 'Last_Name', 'Age'])
print(df)


  First_Name Last_Name  Age
0        Jon     Smith   21
1       Mark     Brown   38
2      Maria       Lee   42
3       Jill     Jones   28
4       Jack      Ford   55


def getTitleBirthYear(row):
    # row: a single row of df (one person)
    if 'Maria' in row.First_Name:
        title = 'Ms'
    else:
        title = 'Mr'
    current_year = 2020
    birth_year = current_year - row.Age
    return title, birth_year

Desired output of getTitleBirthYear, one result per row:

  title  birth_year
0    Mr        1999
1    Mr        1982
2    Ms        1978
3    Mr        1992
4    Mr        1965

Final expected output:

  First_Name Last_Name  Age title  birth_year
0        Jon     Smith   21    Mr        1999
1       Mark     Brown   38    Mr        1982
2      Maria       Lee   42    Ms        1978
3       Jill     Jones   28    Mr        1992
4       Jack      Ford   55    Mr        1965

Please suggest. Thanks!


Answer 1:


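Here are two ways to do it, both using apply with axis=1 and assigning the result to the new columns:

import pandas as pd
# Option 1: build a DataFrame from the list of returned tuples
df[['title', 'birth_year']] = pd.DataFrame(df.apply(getTitleBirthYear, axis=1).tolist())

# Option 2: let apply expand each returned tuple into separate columns
df[['title', 'birth_year']] = df.apply(getTitleBirthYear, axis=1, result_type='expand')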

  First_Name Last_Name  Age title  birth_year
0        Jon     Smith   21    Mr        1999
1       Mark     Brown   38    Mr        1982
2      Maria       Lee   42    Ms        1978
3       Jill     Jones   28    Mr        1992
4       Jack      Ford   55    Mr        1965
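
For reference, here is a self-contained sketch of both approaches on the sample data. It assumes the function receives one row at a time (the parameter name `row` and the names `people`, `opt1` and `opt2` are illustrative, not from the original) and it hard-codes 2020 as the current year, as the question does.

```python
import pandas as pd

people = [['Jon', 'Smith', 21], ['Mark', 'Brown', 38], ['Maria', 'Lee', 42],
          ['Jill', 'Jones', 28], ['Jack', 'Ford', 55]]
df = pd.DataFrame(people, columns=['First_Name', 'Last_Name', 'Age'])


def getTitleBirthYear(row):
    # With axis=1, apply passes each row as a Series, so row.First_Name is a string.
    title = 'Ms' if 'Maria' in row.First_Name else 'Mr'
    birth_year = 2020 - row.Age  # 2020 is hard-coded, as in the question
    return title, birth_year


# Option 1: collect the returned tuples and build a DataFrame from them.
# The assignment aligns on the index, so this relies on df having the
# default 0..n-1 index.
opt1 = df.copy()
opt1[['title', 'birth_year']] = pd.DataFrame(
    opt1.apply(getTitleBirthYear, axis=1).tolist())

# Option 2: ask apply to expand each returned tuple into separate columns.
opt2 = df.copy()
opt2[['title', 'birth_year']] = opt2.apply(
    getTitleBirthYear, axis=1, result_type='expand')

print(opt2)
```

Option 2 keeps the original index on the intermediate result, so it is probably the safer choice when df has been filtered or re-indexed.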



Answer 2:


Although apply works, it is better to use vectorized operations where possible (see "When should I (not) want to use pandas apply() in my code?"). Your logic can be simplified as follows:

import numpy as np
import pandas as pd
print(df.assign(title=np.where(df["First_Name"].eq("Maria"), "Ms", "Mr"),
                birth_year=pd.Timestamp.now().year-df["Age"]))  # use 2020-df["Age"] to reproduce the output below

  First_Name Last_Name  Age title  birth_year
0        Jon     Smith   21    Mr        1999
1       Mark     Brown   38    Mr        1982
2      Maria       Lee   42    Ms        1978
3       Jill     Jones   28    Mr        1992
4       Jack      Ford   55    Mr        1965
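
A self-contained version of the vectorized approach might look like the sketch below. The year is pinned to 2020 through a constant (`CURRENT_YEAR`, a name introduced here for illustration) so that the result matches the tables above regardless of when the code runs. The names `people` and `result` are likewise assumptions.

```python
import numpy as np
import pandas as pd

people = [['Jon', 'Smith', 21], ['Mark', 'Brown', 38], ['Maria', 'Lee', 42],
          ['Jill', 'Jones', 28], ['Jack', 'Ford', 55]]
df = pd.DataFrame(people, columns=['First_Name', 'Last_Name', 'Age'])

# Pinned for reproducibility; swap in pd.Timestamp.now().year for the real current year.
CURRENT_YEAR = 2020

result = df.assign(
    # np.where picks "Ms" where the condition is True and "Mr" elsewhere.
    title=np.where(df["First_Name"].eq("Maria"), "Ms", "Mr"),
    # Series arithmetic operates on the whole Age column at once.
    birth_year=CURRENT_YEAR - df["Age"],
)

# assign returns a new DataFrame; df itself is left unchanged.
print(result)
```

Because assign does not modify df in place, write `df = df.assign(...)` if the new columns should be kept on the original variable.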


Source: https://stackoverflow.com/questions/65110840/how-to-add-insert-output-of-a-function-call-that-returns-multiple-fields-as-new
