Python: Combined masking in NumPy

Question

In a NumPy array, I want to replace all nan and inf values with a fixed number. Can I do that in one step to save computation time (the arrays are really big)?

import numpy as np
a = np.arange(10.0)
a[3] = np.nan
a[5] = np.inf
a[7] = -np.inf
# a: [  0.   1.   2.  nan   4.  inf   6. -inf   8.   9.]

a[np.isnan(a)] = -999
a[np.isinf(a)] = -999
# a: [  0.   1.   2.  -999.   4.  -999.   6. -999.   8.   9.]

The code above works fine. But I am looking for something like:

a[np.isnan(a) or np.isinf(a)] = -999

This does not work, and I can see why. I just think it would be better if every item of a were checked only once.
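For readers wondering why the `or` form fails: Python's `or` asks its left operand for a single truth value, and NumPy refuses to give one for an array with more than one element. A minimal sketch reproducing the error with the array from the question (the variable name `error_message` is only for illustration):

```python
import numpy as np

a = np.arange(10.0)
a[3] = np.nan
a[5] = np.inf
a[7] = -np.inf

try:
    # `or` calls bool() on the left operand, which is ambiguous for a
    # multi-element array, so NumPy raises instead of guessing.
    mask = np.isnan(a) or np.isinf(a)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The message suggests `a.any()` or `a.all()`, but neither builds an element-wise mask; an element-wise operator is needed instead.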

Answer 1:

This seems to work:

a[np.isnan(a) | np.isinf(a)] = 2

np.isnan() and np.isinf() each return a boolean NumPy array.

Boolean NumPy arrays can be combined element-wise with bitwise operators such as & and |.
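A small sketch of combining boolean masks element-wise, as described above. The array values and the names `bad`, `both`, and `big_or_nan` are made up for illustration:

```python
import numpy as np

a = np.array([0.0, np.nan, 2.0, np.inf, -np.inf])

is_nan = np.isnan(a)   # True only where a is NaN
is_inf = np.isinf(a)   # True where a is +inf or -inf

bad = is_nan | is_inf    # element-wise OR
both = is_nan & is_inf   # element-wise AND (never True here)

# Parentheses matter when comparisons are involved, because | and &
# bind more tightly than comparison operators such as > and ==.
big_or_nan = (a > 1) | is_nan

a[bad] = -999
print(a)
```

Without the parentheses, `a > 1 | is_nan` would be parsed as `a > (1 | is_nan)`, which is not the intended mask.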

Answer 2:

NumPy comes with its own vectorized version of or:

a[np.logical_or(np.isnan(a), np.isinf(a))] = -999

While the above version is clearly understandable, there is a faster one, which is a bit odd:

a[np.isnan(a-a)] = -9999

The idea behind this is that np.inf - np.inf = np.nan (and np.nan - np.nan is also np.nan), while a - a is 0 for every finite value.
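To make the subtraction trick concrete, here is a minimal sketch on a four-element array (the values are chosen for illustration). Subtracting inf from inf is an invalid floating-point operation, so NumPy may emit a RuntimeWarning; the `np.errstate` context is an addition here to keep the output quiet, not part of the original answer:

```python
import numpy as np

a = np.array([1.5, np.nan, np.inf, -np.inf])

# inf - inf is an invalid operation, so NumPy may emit a RuntimeWarning;
# np.errstate silences it for this block only.
with np.errstate(invalid="ignore"):
    diff = a - a   # finite -> 0.0; nan, inf, -inf -> nan

mask = np.isnan(diff)
print(mask)
```

Note that `a - a` allocates a temporary float array the same size as `a`, which may matter for very large inputs.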

%timeit a[np.isnan(a-a)] = -999
# 100000 loops, best of 3: 11.7 µs per loop
%timeit a[np.isnan(a) | np.isinf(a)] = -999
# 10000 loops, best of 3: 51.4 µs per loop
%timeit a[np.logical_or(np.isnan(a), np.isinf(a))] = -999
# 10000 loops, best of 3: 51.4 µs per loop

Hence the | and np.logical_or versions appear to perform the same, at least in this measurement.
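Timings like these depend on array size, hardware, and NumPy version. The in-place assignment also changes a on the first run, so later iterations see an array with no nan or inf left. A hedged sketch for re-measuring only the mask construction on a larger array follows; the array size, the seed, and the `candidates` dict are assumptions for illustration, and no particular ranking is implied:

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(100_000)
a[::100] = np.nan
a[1::100] = np.inf
a[2::100] = -np.inf

# Each entry builds the same mask in a different way.
candidates = {
    "isnan | isinf": lambda: np.isnan(a) | np.isinf(a),
    "logical_or": lambda: np.logical_or(np.isnan(a), np.isinf(a)),
    "isnan(a - a)": lambda: np.isnan(a - a),
    "~isfinite": lambda: ~np.isfinite(a),
}

# a - a triggers an "invalid value" warning for inf entries; silence it.
with np.errstate(invalid="ignore"):
    reference = candidates["isnan | isinf"]()
    for name, build_mask in candidates.items():
        # Check that every variant selects exactly the same elements.
        assert np.array_equal(build_mask(), reference)
        seconds = min(timeit.repeat(build_mask, number=20, repeat=3))
        print(f"{name:>14}: {seconds / 20 * 1e6:.1f} us per call")
```

Because the mask is built without assigning into a, every repetition works on the same data.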

Answer 3:

You could use np.isfinite, which checks that a number is neither infinite nor NaN:

a[~np.isfinite(a)] = -999
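A short sketch checking that the np.isfinite mask selects the same elements as the two separate tests, using the array from the question. The last line shows np.nan_to_num as an assumed alternative that was not part of the original answer; its nan, posinf, and neginf keyword arguments require NumPy 1.17 or later:

```python
import numpy as np

a = np.arange(10.0)
a[3] = np.nan
a[5] = np.inf
a[7] = -np.inf

# A single call builds the mask of non-finite entries.
mask = ~np.isfinite(a)
same_as_two_checks = np.array_equal(mask, np.isnan(a) | np.isinf(a))

# Work on a copy so the original array stays available for comparison.
cleaned = a.copy()
cleaned[mask] = -999

# Assumed alternative (keyword arguments need NumPy >= 1.17):
# returns a new array with the replacements applied.
alt = np.nan_to_num(a, nan=-999, posinf=-999, neginf=-999)

print(same_as_two_checks, cleaned)
```

If modifying a in place is acceptable, the copy can be dropped and the assignment applied to a directly, as in the answer above.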

Source: https://stackoverflow.com/questions/45614447/python-combined-masking-in-numpy

Tags

python

numpy

nan