inequality comparison of numpy array with nan to a scalar

Posted by こ雲淡風輕ζ on 2019-11-30 08:06:10

Question


I am trying to set members of an array that are below a threshold to nan. This is part of a QA/QC process and the incoming data may already have slots that are nan.

So as an example my threshold might be -1000 and hence I would want to set -3000 to nan in the following array

x = np.array([np.nan,1.,2.,-3000.,np.nan,5.])

The following:

x[x < -1000.] = np.nan

produces the correct result but also raises a RuntimeWarning. Suppressing it by disabling warnings globally

warnings.filterwarnings("ignore")
...
warnings.resetwarnings()

is heavy-handed and potentially unsafe.

Trying to index twice with fancy indexing as follows doesn't produce any effect:

nonan = np.where(~np.isnan(x))[0]
x[nonan][x[nonan] < -1000.] = np.nan

I assume this is because a copy is made due to the integer index or the use of indexing twice.
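
A minimal sketch of why the double-indexing attempt leaves x unchanged, using the same array as above. The names tmp and shares are illustrative only. Integer-array indexing returns a new array, so the second assignment writes into a temporary rather than into x:

```python
import numpy as np

x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])
nonan = np.where(~np.isnan(x))[0]

# Indexing with an integer array returns a copy, not a view of x.
tmp = x[nonan]
shares = np.shares_memory(tmp, x)

# The assignment below changes only the temporary copy.
tmp[tmp < -1000.] = np.nan

print(shares)  # False
print(x[3])    # -3000.0, the original array is untouched
```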

Does anyone have a relatively simple solution? It would be fine to use a masked array in the process, but the final product has to be an ndarray and I can't introduce new dependencies. Thanks.


Answer 1:


Any comparison (other than !=) of a NaN to a non-NaN value will always return False:

>>> x < -1000
array([False, False, False,  True, False, False], dtype=bool)

So you can simply ignore the fact that there are NaNs already in your array and do:

>>> x[x < -1000] = np.nan
>>> x
array([ nan,   1.,   2.,  nan,  nan,   5.])
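
As a quick check of the comparison rule stated above, the following sketch evaluates each operator against NaN. The variable names are illustrative, and whether a RuntimeWarning accompanies the array comparison may depend on the NumPy version and platform:

```python
import numpy as np

nan = np.nan

# Ordered comparisons and equality involving NaN are False.
lt = nan < -1000.
gt = nan > -1000.
eq = nan == nan

# Inequality is the one exception: it is True.
ne = nan != -1000.

# The same rule applies element-wise to arrays.
arr = np.array([np.nan, -3000., 1.])
below = arr < -1000.

print(lt, gt, eq, ne)
print(below)
```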

EDIT: I didn't see any warning when I ran the above, but if you really need to keep the NaNs out of the comparison, you can do something like:

mask = ~np.isnan(x)
mask[mask] &= x[mask] < -1000
x[mask] = np.nan



Answer 2:


One option is to disable the relevant warnings with numpy.errstate:

with numpy.errstate(invalid='ignore'):
    ...

To turn off the relevant warnings globally, use numpy.seterr.
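
A minimal sketch of the context-manager approach applied to the array from the question. The before and after variables are only there to illustrate that the floating-point error settings are restored once the block exits:

```python
import numpy as np

x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])

before = np.geterr()

# Suppress "invalid value" warnings only for the comparison.
with np.errstate(invalid="ignore"):
    x[x < -1000.] = np.nan

# The previous error-handling settings are back in effect here.
after = np.geterr()

print(x)
print(before == after)
```

Scoping the suppression to a single statement avoids hiding unrelated warnings elsewhere in the program.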




Answer 3:


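np.less() has a where argument that controls where the comparison is applied. Entries where the condition is False are left uninitialized unless an out array is supplied, so pass one that is pre-filled with False:

x[np.less(x, -1000., out=np.zeros(x.shape, dtype=bool), where=~np.isnan(x))] = np.nan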



Answer 4:


I personally ignore the warnings using the np.errstate context manager in the answer already given, as the code clarity is worth the extra time, but here is an alternative.

# given
x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])

# apply NaNs as desired
mask = np.zeros(x.shape, dtype=bool)
np.less(x, -1000, out=mask, where=~np.isnan(x))
x[mask] = np.nan

# expected output and comparison
y = np.array([np.nan, 1., 2., np.nan, np.nan, 5.])
assert np.allclose(x, y, rtol=0., atol=1e-14, equal_nan=True)

The np.less ufunc takes an optional where argument and evaluates the comparison only at positions where it is True, unlike the np.where function, which receives both options already fully evaluated and then picks between them. At positions where it is False, the out array keeps whatever value it already held, which is why mask is initialized to False before the call.
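
To illustrate how where and out interact, here is a small sketch that deliberately pre-fills the output with True instead of False. The name prefilled is hypothetical and chosen only for this demonstration:

```python
import numpy as np

x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])

# Pre-fill with True to make the skipped positions visible.
prefilled = np.ones(x.shape, dtype=bool)
np.less(x, -1000., out=prefilled, where=~np.isnan(x))

# NaN positions (0 and 4) were skipped, so they keep the pre-filled True.
print(prefilled)
```

With this pre-fill, the existing NaN slots would also be selected by the mask, so the choice of initial value for out determines what happens at the skipped positions.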




Answer 5:


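A little bit late, but this is how I would do it:

x = np.array([np.nan, 1., 2., -3000., np.nan, 5.])

# integer positions of the non-NaN entries
igood = np.where(~np.isnan(x))[0]
# compare only those entries, then assign through a single index into x
x[igood[x[igood] < -1000.]] = np.nan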


Source: https://stackoverflow.com/questions/25345843/inequality-comparison-of-numpy-array-with-nan-to-a-scalar
