numpy corrcoef - compute correlation matrix while ignoring missing data

故里飘歌 2020-12-29 06:15

I am trying to compute a correlation matrix of several values. These values include some 'nan' values. I'm using numpy.corrcoef. For element (i, j) of the output correla

3 answers
  •  太阳男子
    2020-12-29 06:50

    This works using NumPy's masked array module, numpy.ma. Masking the invalid entries makes corrcoef skip every pair in which either value is NaN:

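    import numpy as np
    import numpy.ma as ma
    
    # np.nan is the supported spelling; the np.NaN alias was removed in NumPy 2.0
    A = [1, 2, 3, 4, 5, np.nan]
    B = [2, 3, 4, 5.25, np.nan, 100]
    
    # masked_invalid masks NaN/inf entries so corrcoef leaves them out
    print(ma.corrcoef(ma.masked_invalid(A), ma.masked_invalid(B)))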
    

    It outputs:

    [[1.0 0.99838143945703]
     [0.99838143945703 1.0]]
    

    Read more here: https://docs.scipy.org/doc/numpy/reference/maskedarray.generic.html
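
    For more than two variables, the same idea can be spelled out by hand with plain NumPy. The following is only a minimal sketch, not part of the original answer: the helper name `pairwise_corrcoef` is hypothetical, and it assumes each variable is one row of equal length. For every pair of rows it keeps only the positions where both values are present, then calls np.corrcoef on what remains.

```python
import numpy as np

def pairwise_corrcoef(rows):
    """Correlation matrix with pairwise deletion of NaNs (illustrative helper)."""
    data = np.asarray(rows, dtype=float)
    n = data.shape[0]
    # Start with NaN so pairs with too few shared observations stay undefined
    out = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(i, n):
            # Keep only positions where both variables have a value
            valid = ~np.isnan(data[i]) & ~np.isnan(data[j])
            if valid.sum() < 2:
                continue
            r = np.corrcoef(data[i, valid], data[j, valid])[0, 1]
            out[i, j] = out[j, i] = r
    return out

A = [1, 2, 3, 4, 5, np.nan]
B = [2, 3, 4, 5.25, np.nan, 100]
result = pairwise_corrcoef([A, B])
print(result)
```

    On the two lists above, the off-diagonal entry should agree with the masked-array result, since both approaches end up using only the first four pairs.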
