pandas create new column based on values from other columns / apply a function of multiple columns, row-wise

广开言路 2020-11-22 06:24

I want to apply my custom function (it uses an if-else ladder) to these six columns (ERI_Hispanic, ERI_AmerInd_AKNatv, ERI_Asian,

5 answers
  •  谎友^ (original poster)
    2020-11-22 06:30

    OK, there are two steps to this. First, write a function that does the translation you want. I've put an example together based on your pseudo-code:

    def label_race(row):
        """Return a race label for one row of 0/1 indicator flags."""
        # Hispanic takes precedence over every other flag.
        if row['eri_hispanic'] == 1:
            return 'Hispanic'
        # More than one of the remaining flags is set.
        if (row['eri_afr_amer'] + row['eri_asian'] + row['eri_hawaiian']
                + row['eri_nat_amer'] + row['eri_white']) > 1:
            return 'Two Or More'
        # Exactly one flag can be set from here on.
        if row['eri_nat_amer'] == 1:
            return 'A/I AK Native'
        if row['eri_asian'] == 1:
            return 'Asian'
        if row['eri_afr_amer'] == 1:
            return 'Black/AA'
        if row['eri_hawaiian'] == 1:
            return 'Haw/Pac Isl.'
        if row['eri_white'] == 1:
            return 'White'
        # No flag is set.
        return 'Other'
    

    You may want to go over this, but it seems to do the trick. Notice that the parameter passed to the function is a Series object, here named "row".
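    As a minimal sketch of what that Series looks like, the snippet below pulls one row out of a small made-up DataFrame (only two of the flag columns, with illustrative values) and looks a value up by column name, which is the same kind of access label_race performs:

```python
import pandas as pd

# Hypothetical two-column frame; the real data has more flag columns.
df = pd.DataFrame({"eri_hispanic": [0, 1], "eri_white": [1, 0]})

# A single row is a Series whose index holds the column names.
row = df.iloc[0]
is_series = isinstance(row, pd.Series)
index_labels = list(row.index)

# Values are looked up by column name, as in row['eri_white'] == 1.
white_flag = row["eri_white"]
print(is_series, index_labels, white_flag)
```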

    Next, use the pandas apply function to call it on every row, e.g.

    df.apply(label_race, axis=1)
    

    Note the axis=1 specifier: it means the function is applied to each row rather than to each column. The results are here:

    0           White
    1        Hispanic
    2           White
    3           White
    4           Other
    5           White
    6     Two Or More
    7           White
    8    Haw/Pac Isl.
    9           White
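    To make the effect of the axis argument concrete, here is a small sketch on a made-up two-column frame (the column names a and b are purely illustrative). With axis=0 the function receives each column in turn; with axis=1 it receives each row:

```python
import pandas as pd

# Hypothetical data, unrelated to the race columns above.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: the function is called once per column, so there is one result per column.
col_totals = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function is called once per row, so there is one result per row.
row_totals = df.apply(lambda row: row["a"] + row["b"], axis=1)

print(col_totals)
print(row_totals)
```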
    

    If you're happy with those results, run it again, this time saving the results into a new column in your original dataframe.

    df['race_label'] = df.apply(label_race, axis=1)
    

    The resultant dataframe looks like this (scroll to the right to see the new column):

          lname   fname rno_cd  eri_afr_amer  eri_asian  eri_hawaiian   eri_hispanic  eri_nat_amer  eri_white rno_defined    race_label
    0      MOST    JEFF      E             0          0             0              0             0          1       White         White
    1    CRUISE     TOM      E             0          0             0              1             0          0       White      Hispanic
    2      DEPP  JOHNNY    NaN             0          0             0              0             0          1     Unknown         White
    3     DICAP     LEO    NaN             0          0             0              0             0          1     Unknown         White
    4    BRANDO  MARLON      E             0          0             0              0             0          0       White         Other
    5     HANKS     TOM    NaN             0          0             0              0             0          1     Unknown         White
    6    DENIRO  ROBERT      E             0          1             0              0             0          1       White   Two Or More
    7    PACINO      AL      E             0          0             0              0             0          1       White         White
    8  WILLIAMS   ROBIN      E             0          0             1              0             0          0       White  Haw/Pac Isl.
    9  EASTWOOD   CLINT      E             0          0             0              0             0          1       White         White
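    As an aside that is not part of the answer above: row-wise apply calls a Python function once per row, which can be slow on large frames. One possible alternative is numpy.select, which takes the first matching condition for each row, so listing the conditions in the same order as the if statements should give the same labels. The sketch below assumes the flags are 0/1 integers with no missing values and rebuilds only the flag columns from the ten rows shown above:

```python
import numpy as np
import pandas as pd

# Flag columns copied from the ten rows shown above (name columns omitted).
df = pd.DataFrame({
    "eri_afr_amer": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "eri_asian":    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    "eri_hawaiian": [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    "eri_hispanic": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    "eri_nat_amer": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "eri_white":    [1, 0, 1, 1, 0, 1, 1, 1, 0, 1],
})

non_hispanic = ["eri_afr_amer", "eri_asian", "eri_hawaiian",
                "eri_nat_amer", "eri_white"]

# Conditions in the same priority order as the if statements in label_race.
conditions = [
    df["eri_hispanic"] == 1,
    df[non_hispanic].sum(axis=1) > 1,
    df["eri_nat_amer"] == 1,
    df["eri_asian"] == 1,
    df["eri_afr_amer"] == 1,
    df["eri_hawaiian"] == 1,
    df["eri_white"] == 1,
]
labels = ["Hispanic", "Two Or More", "A/I AK Native", "Asian",
          "Black/AA", "Haw/Pac Isl.", "White"]

# For each row, np.select returns the label of the first true condition.
df["race_label"] = np.select(conditions, labels, default="Other")
print(df["race_label"])
```

    Whether this is worth the extra indirection depends on the size of the data; for a few thousand rows the apply version is usually fine and easier to read.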
    
