numpy corrcoef - compute correlation matrix while ignoring missing data

故里飘歌 2020-12-29 06:15

I am trying to compute a correlation matrix of several values. These values include some 'nan' values. I'm using numpy.corrcoef. For element(i,j) of the output correla

3 answers

太阳男子 (original poster)

2020-12-29 06:50
This will work, using NumPy's masked array module (numpy.ma):
```
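import numpy as np
import numpy.ma as ma

# np.nan marks missing values (the np.NaN alias was removed in NumPy 2.0)
A = [1, 2, 3, 4, 5, np.nan]
B = [2, 3, 4, 5.25, np.nan, 100]

# masked_invalid masks NaN/inf entries; ma.corrcoef then ignores masked pairs
print(ma.corrcoef(ma.masked_invalid(A), ma.masked_invalid(B)))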
```
It outputs:
```
[[1.0 0.99838143945703]
 [0.99838143945703 1.0]]
```
Read more here: https://docs.scipy.org/doc/numpy/reference/maskedarray.generic.html
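
For more than two variables, the same idea can be sketched without masked arrays. This is only an illustration and assumes the goal is that each entry (i, j) uses just the positions where both variable i and variable j are present. The helper name `pairwise_corrcoef` is made up for this example and is not part of NumPy; it relies only on `np.isnan` and `np.corrcoef`.

```python
import numpy as np

def pairwise_corrcoef(rows):
    """Hypothetical helper (not a NumPy function): correlation matrix where
    each pair of variables uses only positions at which both are non-NaN."""
    data = np.asarray(rows, dtype=float)
    n = data.shape[0]
    out = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(i, n):
            # Positions where both variable i and variable j have a value
            valid = ~np.isnan(data[i]) & ~np.isnan(data[j])
            if valid.sum() < 2:
                continue  # too few shared observations; entry stays NaN
            r = np.corrcoef(data[i, valid], data[j, valid])[0, 1]
            out[i, j] = out[j, i] = r
    return out

A = [1, 2, 3, 4, 5, np.nan]
B = [2, 3, 4, 5.25, np.nan, 100]
result = pairwise_corrcoef([A, B])
print(result)
```

For the A and B above, the off-diagonal entry should agree with the 0.998... value printed by the masked-array version, since only the first four positions are shared. The explicit loop is slow for many variables, so treat it as a way to check results and not as a replacement for numpy.ma.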