Pandas Lambda Function with NaN Support

南旧 2021-01-02 19:52

I am trying to write a lambda function in pandas that checks whether Col1 is NaN and, if so, uses another column's data. I am having trouble getting the code (below) to c

4 Answers
  • 2021-01-02 19:53

    Assuming that you do have a second column, that is:

    df = pd.DataFrame({'Col1': [1, 2, 3, np.nan], 'Col2': [1, 2, 3, 4]})

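    The idiomatic solution is to fill the missing values from Col2 and assign the result back (calling fillna with inplace=True on a selected column is not guaranteed to modify df in recent pandas versions):

    df['Col1'] = df['Col1'].fillna(df['Col2'])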
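
As a rough check, the sketch below compares the fillna approach with the row-wise lambda the question asks about. It reuses the example frame from this answer; the variable names filled and via_lambda are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same example data as in the answer above.
df = pd.DataFrame({"Col1": [1, 2, 3, np.nan], "Col2": [1, 2, 3, 4]})

# Vectorized: take Col2 wherever Col1 is missing.
filled = df["Col1"].fillna(df["Col2"])

# Row-wise lambda doing the same thing, one row at a time.
via_lambda = df.apply(
    lambda x: x["Col2"] if pd.isnull(x["Col1"]) else x["Col1"], axis=1
)

# Both approaches should produce the same values.
same_values = filled.tolist() == via_lambda.tolist()
```

Both give the same values here; fillna avoids the Python-level loop over rows that apply with axis=1 performs.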
    
  • 2021-01-02 20:03

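    With pandas 0.24.2, I use

    df.apply(lambda x: x['col_name'] if x['col1'] is np.nan else expressions_another, axis=1)
    

    because pd.isnull() did not seem to work for me.

    In my work, I observed the following.

    No results:

    df['prop'] = df.apply(lambda x: (x['buynumpday'] / x['cnumpday']) if pd.isnull(x['cnumpday']) else np.nan, axis=1)
    

    Results exist:

    df['prop'] = df.apply(lambda x: (x['buynumpday'] / x['cnumpday']) if x['cnumpday'] is not np.nan else np.nan, axis=1)

    Note that the condition in the first snippet is inverted: it divides only when cnumpday is null, so every row ends up NaN. Using pd.notnull(x['cnumpday']) there gives the intended behaviour. The identity test "is not np.nan" is unreliable, because a NaN read from a float column is usually not the same object as np.nan.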
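
To make the difference between an identity test and a value test concrete, here is a minimal sketch. The column names come from this answer, but the numbers are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names come from the answer above.
df = pd.DataFrame({"buynumpday": [10.0, 4.0, 6.0],
                   "cnumpday": [2.0, np.nan, 3.0]})

# Identity test: compares object identity with np.nan, not the value.
# Values read from a float column are usually separate NumPy scalars,
# so this check is not a reliable way to find missing entries.
by_identity = df.apply(lambda x: x["cnumpday"] is np.nan, axis=1)

# Value test: pd.isnull inspects the value itself.
by_value = df.apply(lambda x: pd.isnull(x["cnumpday"]), axis=1)

# Divide only where the denominator is present, otherwise keep NaN.
df["prop"] = df.apply(
    lambda x: x["buynumpday"] / x["cnumpday"]
    if pd.notnull(x["cnumpday"]) else np.nan,
    axis=1,
)
```

Printing by_identity next to by_value on your own data shows whether the identity test misses any missing rows.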
    
  • 2021-01-02 20:06

    You need to use np.isnan():

    import numpy as np
    df2 = df.apply(lambda x: 2 if np.isnan(x['Col1']) else 1, axis=1)
    
    df2
    Out[1307]: 
    0    1
    1    1
    2    1
    3    2
    dtype: int64
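
A vectorized variant of the same flagging logic, as a sketch only: it assumes the Col1/Col2 frame from the first answer and swaps the row-wise lambda for np.where with Series.isna. The second half illustrates one caveat of np.isnan, which rejects non-numeric input, whereas pd.isnull accepts it.

```python
import numpy as np
import pandas as pd

# Example frame assumed from the first answer.
df = pd.DataFrame({"Col1": [1, 2, 3, np.nan], "Col2": [1, 2, 3, 4]})

# Vectorized equivalent of the lambda: 2 where Col1 is missing, 1 elsewhere.
flags = pd.Series(np.where(df["Col1"].isna(), 2, 1), index=df.index)

# np.isnan only accepts numeric input and raises TypeError on a string.
try:
    np.isnan("text")
    isnan_accepts_strings = True
except TypeError:
    isnan_accepts_strings = False

# pd.isnull handles arbitrary scalars, including strings and None.
string_is_null = pd.isnull("text")
none_is_null = pd.isnull(None)
```

If the column may hold strings or None, pd.isnull is the safer check inside a lambda.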
    
  • 2021-01-02 20:14

    You need pandas.isnull to check whether a scalar is NaN:

    df = pd.DataFrame({'Col1': [1, 2, 3, np.nan],
                       'Col2': [8, 9, 7, 10]})
    
    df2 = df.apply(lambda x: x['Col2'] if pd.isnull(x['Col1']) else x['Col1'], axis=1)
    
    print (df)
       Col1  Col2
    0   1.0     8
    1   2.0     9
    2   3.0     7
    3   NaN    10
    
    print (df2)
    0     1.0
    1     2.0
    2     3.0
    3    10.0
    dtype: float64
    

    A better option is Series.combine_first:

    df['Col1'] = df['Col1'].combine_first(df['Col2'])
    
    print (df)
       Col1  Col2
    0   1.0     8
    1   2.0     9
    2   3.0     7
    3  10.0    10
    

    Another option is Series.update, but note that it overwrites Col1 with every non-NaN value from Col2, not only the missing entries:

    df['Col1'].update(df['Col2'])
    print (df)
       Col1  Col2
    0   8.0     8
    1   9.0     9
    2   7.0     7
    3  10.0    10
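
The two methods can be compared side by side on a fresh copy of the example frame. This is a minimal sketch; the names gaps_filled and overwritten are made up for illustration.

```python
import numpy as np
import pandas as pd

# Fresh copy of the example frame from this answer.
df = pd.DataFrame({"Col1": [1, 2, 3, np.nan], "Col2": [8, 9, 7, 10]})

# combine_first keeps existing Col1 values and borrows from Col2
# only where Col1 is NaN.
gaps_filled = df["Col1"].combine_first(df["Col2"])

# update modifies the Series it is called on and overwrites every
# position where the other Series has a non-NaN value.
overwritten = df["Col1"].copy()
overwritten.update(df["Col2"])
```

Only combine_first (or fillna) matches what the question asks for; update suits the case where Col2 should take priority wherever it has data.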
    