Advanced slicing when passed list instead of tuple in numpy

萌比男神i 2020-12-10 18:01

In the docs, it says (emphasis mine):

Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object

2 Answers
  • 2020-12-10 18:44

    With a dummy class I can determine how the interpreter translates [...] into calls to __getitem__.

    In [1073]: class Foo():
          ...:     def __getitem__(self, idx):
          ...:         print(idx)
    In [1080]: Foo()[1,2,slice(None)]
    (1, 2, slice(None, None, None))
    In [1081]: Foo()[(1,2,slice(None))]
    (1, 2, slice(None, None, None))
    In [1082]: Foo()[[1,2,slice(None)]]
    [1, 2, slice(None, None, None)]
    

    So wrapping multiple terms with () makes no difference - it gets a tuple in both cases. And a list is passed as a list.
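
The same experiment can be written as a plain script. This is only a minimal sketch: the class name `ShowIndex` is made up for illustration, and it returns the index object instead of printing it so the types can be checked directly.

```python
class ShowIndex:
    """Hypothetical helper: hands back whatever object [] passes in."""

    def __getitem__(self, idx):
        return idx


probe = ShowIndex()

bare = probe[1, 2, slice(None)]        # comma-separated terms
wrapped = probe[(1, 2, slice(None))]   # explicit tuple
as_list = probe[[1, 2, slice(None)]]   # list

# Report the type each spelling delivers to __getitem__.
print(type(bare).__name__, type(wrapped).__name__, type(as_list).__name__)
```

Whatever numpy does with a list index therefore happens inside its own `__getitem__`, not in the Python syntax.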

    So the distinction between tuple and list (or the lack of one) must be coded in the numpy source, which is compiled, so I can't readily study it.

    With a 1d array, indexing with a list produces advanced indexing, picking specific values:

    In [1085]: arr[[1,2,3]]
    Out[1085]: array([ 0.73703368,  0.        ,  0.        ])
    

    but replacing one of those values with a tuple or a slice raises an error:

    In [1086]: arr[[1,2,(2,3)]]
    IndexError: too many indices for array
    
    In [1088]: arr[[1,2,slice(None)]] 
    IndexError: too many indices for array
    

    Here the list is treated as a tuple: numpy tries to match each element with a dimension.

    So at the top level a list and a tuple are treated the same if the list can't be interpreted as an advanced indexing list.

    Notice also the difference with single-item lists:

    In [1089]: arr[[1]]
    Out[1089]: array([ 0.73703368])
    In [1090]: arr[(1,)]
    Out[1090]: 0.73703367969998546
    In [1091]: arr[1]
    Out[1091]: 0.73703367969998546
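
A self-contained way to see the single-item difference, as a rough sketch: the array `arr` below is a stand-in created with `np.arange`, not the array from the session above.

```python
import numpy as np

arr = np.arange(5.0)       # stand-in 1d array: [0., 1., 2., 3., 4.]

from_list = arr[[1]]       # list: advanced indexing, the axis is kept
from_tuple = arr[(1,)]     # one-element tuple: same as arr[1]
from_int = arr[1]          # plain integer: a scalar

# Compare the shape of the list result with the dimensionality of the others.
print(from_list.shape, np.ndim(from_tuple), np.ndim(from_int))
```

The list form returns a one-element array, while the tuple and the bare integer both return a scalar.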
    

    Some functions like np.apply_along_axis and np.apply_over_axes generate an index as a list or array, and then apply it. They work with a list or array because it is mutable. Some then wrap it in a tuple before using it as an index; others don't bother. That difference sort of bothered me, but these test cases indicate that the tuple wrapper is often optional.

    In [1092]: idx=[1,2,slice(None)]
    In [1093]: np.ones((2,3,4))[idx]
    Out[1093]: array([ 1.,  1.,  1.,  1.])
    In [1094]: np.ones((2,3,4))[tuple(idx)]
    Out[1094]: array([ 1.,  1.,  1.,  1.])
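
The build-as-list, index-as-tuple pattern can be sketched as follows. The helper name `take_along` is hypothetical, not a numpy function. As far as I know, newer NumPy releases no longer accept a list containing slices as a multidimensional index (it was deprecated around 1.15 and later removed), so the explicit `tuple(...)` is the spelling that should behave the same on old and new versions.

```python
import numpy as np


def take_along(arr, axis, position):
    """Hypothetical helper: pick `position` on `axis`, keep other axes whole."""
    idx = [slice(None)] * arr.ndim   # mutable list, easy to edit in place
    idx[axis] = position
    return arr[tuple(idx)]           # tuple: one entry per dimension


a = np.arange(24).reshape(2, 3, 4)
row = take_along(a, 1, 2)            # equivalent to a[:, 2, :]
print(row.shape)
```

Keeping the index in a list while it is being edited and converting only at the moment of indexing avoids relying on the list-as-tuple fallback at all.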
    

    Looks like the tuple wrapper is still needed if I build the index as an object array:

    In [1096]: np.ones((2,3,4))[np.array(idx)]
    ...
    IndexError: arrays used as indices must be of integer (or boolean) type
    In [1097]: np.ones((2,3,4))[tuple(np.array(idx))]
    Out[1097]: array([ 1.,  1.,  1.,  1.])
    

    ===================

    Comment from the function @Eric linked

        /*
         * Sequences < NPY_MAXDIMS with any slice objects
         * or newaxis, Ellipsis or other arrays or sequences
         * embedded, are considered equivalent to an indexing
         * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
         */
    

    ===================

    This function wraps object arrays and lists in a tuple for indexing:

    def apply_along_axis(func1d, axis, arr, *args, **kwargs):
         ....
         ind = [0]*(nd-1)
         i = zeros(nd, 'O')
         ....
         res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
         outarr[tuple(ind)] = res
    
  • 2020-12-10 18:52

    There's an exception to that rule. The Advanced Indexing documentation section doesn't mention it, but up above, near the start of the Basic Slicing and Indexing section, you'll see the following text:

    In order to remain backward compatible with a common usage in Numeric, basic slicing is also initiated if the selection object is any non-ndarray sequence (such as a list) containing slice objects, the Ellipsis object, or the newaxis object, but not for integer arrays or other embedded sequences.


    a[[1, np.array(2)]] doesn't quite trigger basic indexing. It triggers an undocumented part of the backward compatibility logic, as described in a comment in the source code:

        /*
         * Sequences < NPY_MAXDIMS with any slice objects
         * or newaxis, Ellipsis or other arrays or sequences
         * embedded, are considered equivalent to an indexing
         * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
         */
    

    The np.array(2) inside the list causes the list to be treated as if it were a tuple, but the result, a[(1, np.array(2))], is still an advanced indexing operation. It applies the 1 and the 2 to separate axes, unlike a[[1, 2]], so the values look identical to a[1, 2]. With a 3D a, however, it produces a copy instead of a view.
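
The copy-versus-view difference can be checked with `np.shares_memory`. This is a small sketch on a made-up 3D array, and it uses the explicit tuple form a[(1, np.array(2))] rather than the list form, since the list fallback depends on the NumPy version.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

basic = a[1, 2]                  # basic indexing with two integers
advanced = a[(1, np.array(2))]   # a 0-d array in the tuple: advanced indexing

print(np.array_equal(basic, advanced))   # do the values match?
print(np.shares_memory(a, basic))        # does the basic result alias a?
print(np.shares_memory(a, advanced))     # does the advanced result alias a?
```

Writing into `basic` would modify `a`, whereas writing into `advanced` would not, which is the practical consequence of the copy.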
