Consider the array a:
a = np.array([3, 3, np.nan, 3, 3, np.nan])
I could do:
np.isnan(a).argmax()
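One caveat with this one-liner: argmax() on an all-False mask also returns 0, so an array with no NaN at all is indistinguishable from one whose first element is NaN. A minimal sketch of a guarded version follows; the helper name first_nan_index and the -1 sentinel are illustrative choices, not part of the original question.

```python
import numpy as np

def first_nan_index(arr):
    """Return the index of the first NaN in arr, or -1 if there is none.

    Hypothetical helper: the name and the -1 sentinel are assumptions.
    """
    mask = np.isnan(arr)
    if mask.size == 0:
        return -1  # argmax() raises on empty input, so handle it up front
    idx = mask.argmax()  # position of the first True, or 0 if all False
    return int(idx) if mask[idx] else -1

a = np.array([3, 3, np.nan, 3, 3, np.nan])
print(first_nan_index(a))                     # 2
print(first_nan_index(np.array([1.0, 2.0])))  # -1
```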
Here is a Pythonic approach using itertools.takewhile():
from itertools import takewhile
sum(1 for _ in takewhile(np.isfinite, a))
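Note that np.isfinite is also False for infinities, so the count above stops at the first inf as well as at the first NaN. If the data may contain inf, a predicate built on np.isnan is closer to the question. The sketch below assumes that situation; count_before_nan is a hypothetical name, and when there is no NaN it returns len(arr).

```python
from itertools import takewhile
import numpy as np

def count_before_nan(arr):
    """Count leading non-NaN elements; equals the index of the first NaN.

    Hypothetical helper. Returns len(arr) when arr contains no NaN.
    """
    return sum(1 for _ in takewhile(lambda x: not np.isnan(x), arr))

a = np.array([3, 3, np.nan, 3, 3, np.nan])
b = np.array([3, np.inf, np.nan])

print(count_before_nan(a))                        # 2
print(sum(1 for _ in takewhile(np.isfinite, b)))  # 1 (stops at inf)
print(count_before_nan(b))                        # 2 (stops at NaN)
```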
Benchmark against the generator-expression-within-next approach [1]:
In [118]: a = np.repeat(a, 10000)
In [120]: %timeit next(i for i, j in enumerate(a) if np.isnan(j))
100 loops, best of 3: 12.4 ms per loop
In [121]: %timeit sum(1 for _ in takewhile(np.isfinite, a))
100 loops, best of 3: 11.5 ms per loop
But both are still (by far) slower than the NumPy approach:
In [119]: %timeit np.isnan(a).argmax()
100000 loops, best of 3: 16.8 µs per loop
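To reproduce the comparison outside IPython, the following is a rough sketch using the standard timeit module. The repetition count of 5 is an arbitrary assumption, and absolute timings will vary by machine and NumPy version. Because np.repeat repeats each element in place, the first NaN in the enlarged array sits at index 20000.

```python
import timeit
from itertools import takewhile
import numpy as np

# Same setup as the IPython session: each element is repeated 10000 times.
a = np.repeat(np.array([3, 3, np.nan, 3, 3, np.nan]), 10000)

candidates = {
    "next + enumerate": lambda: next(i for i, j in enumerate(a) if np.isnan(j)),
    "takewhile": lambda: sum(1 for _ in takewhile(np.isfinite, a)),
    "isnan + argmax": lambda: np.isnan(a).argmax(),
}

# Check that all three approaches agree before timing them.
results = {name: int(fn()) for name, fn in candidates.items()}

for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=5) / 5  # mean time per call
    print(f"{name:>18}: index {results[name]}, {seconds * 1e3:.3f} ms per call")
```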
1. The problem with this approach is that it uses the enumerate function, which first builds an enumerate object (an iterator-like object) from the NumPy array. Running the generator expression and repeatedly calling next on that iterator then takes time, because every element is processed one at a time at the Python level.