Adding two Series with NaNs

孤街浪徒 2020-11-29 10:22

I'm working through "Python for Data Analysis" and I don't understand a particular piece of functionality. Adding two pandas Series objects will automatically align the index

3 answers
  • 2020-11-29 10:31

    It makes more sense to use pd.concat() here, since it accepts any number of Series, not just two.

    import pandas as pd
    import numpy as np
    
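    a = pd.Series([35000, 71000, 16000, 5000], index=['Ohio', 'Texas', 'Oregon', 'Utah'])
    b = pd.Series([np.nan, 71000, 16000, 35000], index=['California', 'Texas', 'Oregon', 'Ohio'])
    
    # Put the Series side by side as columns (aligned on the index), then sum each row.
    # min_count=1 keeps the result NaN when a row has no valid values at all;
    # sort=True orders the combined index alphabetically.
    pd.concat((a, b), axis=1, sort=True).sum(axis=1, min_count=1)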
    

    Output:

    California         NaN
    Ohio           70000.0
    Oregon         32000.0
    Texas         142000.0
    Utah            5000.0
    dtype: float64
    

    Or with 3 series:

    import pandas as pd
    import numpy as np
    
    # np.nan is used instead of the np.NaN alias, which was removed in NumPy 2.0.
    a = pd.Series([1, np.nan, 4, 5])
    b = pd.Series([3, np.nan, 5, np.nan])
    c = pd.Series([np.nan, np.nan, np.nan, np.nan])
    
    print(pd.concat((a, b, c), axis=1).sum(axis=1, min_count=1))
    
    #0    4.0
    #1    NaN
    #2    9.0
    #3    5.0
    #dtype: float64
    
  • 2020-11-29 10:33

    Pandas does not assume that 500+NaN=500, but it is easy to ask it to do that: a.add(b, fill_value=0)
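
    A minimal sketch of that suggestion, reusing the two Series from the first answer (the name `total` is only illustrative). One caveat: `fill_value=0` only stands in for a value that is missing on one side; a label that is missing in both inputs still comes out as NaN.

```python
import numpy as np
import pandas as pd

a = pd.Series([35000, 71000, 16000, 5000],
              index=['Ohio', 'Texas', 'Oregon', 'Utah'])
b = pd.Series([np.nan, 71000, 16000, 35000],
              index=['California', 'Texas', 'Oregon', 'Ohio'])

# Treat a value that is missing on one side as 0 before adding.
# 'Utah' exists only in a, so it keeps a's value.
# 'California' is absent from a and NaN in b, so it stays NaN.
total = a.add(b, fill_value=0)
print(total)
```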

  • 2020-11-29 10:39

    The default approach is to assume that any computation involving NaN gives NaN as the result. Anything plus NaN is NaN, anything divided by NaN is NaN, etc. If you want to fill the NaN with some value, you have to do that explicitly (as Dan Allan showed in his answer).
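
    To see that default in action, here is a small sketch with made-up Series (the names `a`, `b`, and `plain` are assumptions for illustration). Plain `+` aligns the labels first, and any label that is missing or NaN on either side yields NaN.

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, 2.0], index=['x', 'y'])
b = pd.Series([10.0, np.nan], index=['x', 'z'])

# 'x' is present in both, so the values are added.
# 'y' is absent from b and 'z' is NaN in b (and absent from a), so both give NaN.
plain = a + b
print(plain)

# The same propagation rule applies to scalar arithmetic.
print(np.nan + 1, 1 / np.nan)
```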
