I am experimenting with the numpy.where(condition[, x, y]) function.
From the numpy documentation, I learn that if you give just one array as input, it shou
In Python, (1) means just 1. Parentheses can be freely added to group numbers and expressions for human readability (e.g. (1+3)*3 vs. (1+3,)*3). So to denote a one-element tuple, Python uses (1,) (and requires you to use it as well).
Thus
(array([4, 5, 6, 7, 8]),)
is a one element tuple, that element being an array.
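To make the difference between grouping parentheses and a one-element tuple concrete, here is a minimal sketch in plain Python. The variable names are only illustrative.

```python
# Parentheses alone only group; the trailing comma is what makes a tuple.
grouped = (1)     # plain int
single = (1,)     # one-element tuple

product = (1 + 3) * 3      # arithmetic on the grouped sum
repeated = (1 + 3,) * 3    # tuple repetition, not arithmetic

print(type(grouped).__name__, type(single).__name__)
print(product, repeated)
```

The same rule explains the trailing comma in the where output: it marks a tuple that happens to hold a single array.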
If you applied where to a 2d array, the result would be a 2 element tuple.
The result of where is such that it can be plugged directly into an indexing slot, e.g.
a[where(a>0)]
a[a>0]
should return the same thing,
as would
I,J = where(a>0) # a is 2d
a[I,J]
a[(I,J)]
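As a rough illustration of the 2d case, the sketch below uses a small made-up array (the values are an assumption chosen for the example) and checks that the three indexing forms select the same elements.

```python
import numpy as np

# Small 2d array invented for this example.
a = np.array([[1, -2, 3],
              [-4, 5, -6]])

idx = np.where(a > 0)   # tuple of two index arrays: (rows, cols)
I, J = idx

via_where = a[idx]      # index with the tuple directly
via_mask = a[a > 0]     # boolean mask indexing
via_unpacked = a[I, J]  # index with the unpacked row/column arrays

print(idx)
print(via_where, via_mask, via_unpacked)
```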
Or with your example:
In [278]: a=np.array([1,2,3,4,5,6,7,8,9])
In [279]: np.where(a>4)
Out[279]: (array([4, 5, 6, 7, 8], dtype=int32),) # tuple
In [280]: a[np.where(a>4)]
Out[280]: array([5, 6, 7, 8, 9])
In [281]: I=np.where(a>4)
In [282]: I
Out[282]: (array([4, 5, 6, 7, 8], dtype=int32),)
In [283]: a[I]
Out[283]: array([5, 6, 7, 8, 9])
In [286]: i, = np.where(a>4) # note the , on LHS
In [287]: i
Out[287]: array([4, 5, 6, 7, 8], dtype=int32) # not tuple
In [288]: a[i]
Out[288]: array([5, 6, 7, 8, 9])
In [289]: a[(i,)]
Out[289]: array([5, 6, 7, 8, 9])
======================
np.flatnonzero shows the correct way of returning just one array, regardless of the dimensions of the input array.
In [299]: np.flatnonzero(a>4)
Out[299]: array([4, 5, 6, 7, 8], dtype=int32)
In [300]: np.flatnonzero(a>4)+10
Out[300]: array([14, 15, 16, 17, 18], dtype=int32)
Its doc says:
This is equivalent to a.ravel().nonzero()[0]
In fact that is literally what the function does.
By flattening, it removes the question of what to do with multiple dimensions. It then takes the result out of the tuple, giving you a plain array. Because it flattens, it doesn't have to make a special case for 1d arrays.
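To see the flattening on something other than a 1d array, here is a small sketch with a made-up 2d array b (not from the question). It compares np.flatnonzero with the ravel().nonzero()[0] form quoted from the docs.

```python
import numpy as np

# 2d array invented for this example.
b = np.array([[1, 6, 2],
              [7, 3, 8]])

flat_idx = np.flatnonzero(b > 4)        # one plain array of flat indices
manual = (b > 4).ravel().nonzero()[0]   # the documented equivalent

# Flat indices refer to positions in the flattened array.
values = b.ravel()[flat_idx]

print(flat_idx, manual, values)
```

Note that the indices refer to the flattened array, so they have to be applied to b.ravel() (or b.flat) rather than to b itself.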
===========================
@Divakar suggests np.argwhere:
In [303]: np.argwhere(a>4)
Out[303]:
array([[4],
[5],
[6],
[7],
[8]], dtype=int32)
which does np.transpose(np.where(a>4))
Or if you don't like the column vector, you could transpose it again
In [307]: np.argwhere(a>4).T
Out[307]: array([[4, 5, 6, 7, 8]], dtype=int32)
except now it is a 1xn array.
We could just as well have wrapped where in array:
In [311]: np.array(np.where(a>4))
Out[311]: array([[4, 5, 6, 7, 8]], dtype=int32)
Lots of ways of taking an array out of the where tuple ([0], i,=, transpose, array, etc).
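The options differ mainly in the shape of what you get back. A quick side-by-side sketch, reusing the array from the question (the result names are just illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
result = np.where(a > 4)            # one-element tuple

by_index = result[0]                # 1d array, shape (5,)
unpacked, = result                  # same array via tuple unpacking
as_row = np.array(result)           # 2d array, shape (1, 5)
as_column = np.argwhere(a > 4)      # 2d array, shape (5, 1)

for name, value in [("result[0]", by_index), ("i, = result", unpacked),
                    ("np.array", as_row), ("np.argwhere", as_column)]:
    print(name, value.shape)
```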
Just use the np.asarray function. In your case:
>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> pippo = np.asarray(np.where(array>4))
>>> pippo + 1
array([[5, 6, 7, 8, 9]])
Short answer: np.where is designed to have consistent output regardless of the dimension of the array.
A two-dimensional array has two indices, so the result of np.where is a length-2 tuple containing the relevant indices. This generalizes to a length-3 tuple for 3-dimensions, a length-4 tuple for 4 dimensions, or a length-N tuple for N dimensions. By this rule, it is clear that in 1 dimension, the result should be a length-1 tuple.
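The rule is easy to check empirically. This is a small sketch, using arrays of ones as an arbitrary test input, that records the tuple length returned by np.where for 1 to 4 dimensions.

```python
import numpy as np

lengths = []
for ndim in (1, 2, 3, 4):
    arr = np.ones((2,) * ndim)     # e.g. shape (2, 2, 2) for ndim == 3
    result = np.where(arr > 0)     # tuple with one index array per axis
    lengths.append(len(result))

print(lengths)
```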