How is numpy's fancy indexing implemented?

别那么骄傲 2020-12-08 08:10

I was doing a little experimentation with 2D lists and numpy arrays. From this, I've raised 3 questions I'm quite curious to know the answers to.

First, I initiali

3 answers
  •  渐次进展
    2020-12-08 08:25

    my_list[:,] is translated by the interpreter into

    my_list.__getitem__((slice(None, None, None),))
    

    It's like calling a function with *args, except that the interpreter also translates the : notation into a slice object. Without the comma, it would pass just the slice; with the comma, it passes a tuple.

    The list __getitem__ does not accept a tuple, as shown by the error. An array __getitem__ does. I believe the ability to pass a tuple and create slice objects was added as a convenience for numpy (or its predecessors). The tuple notation has never been added to the list __getitem__. (There is an operator.itemgetter class that allows a form of advanced indexing, but internally it just iterates over the requested items.)
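
    To see the translation for yourself, here is a minimal sketch. ShowKey is a hypothetical probe class (not part of Python or numpy) whose __getitem__ simply hands back whatever key the interpreter passed in:

```python
class ShowKey:
    """Hypothetical probe: __getitem__ simply returns the key it was given."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

print(probe[:])          # a bare slice object
print(probe[:,])         # a 1-tuple containing a slice
print(probe[:, [0, 1]])  # a 2-tuple: (slice, list)

# A plain list rejects the tuple form outright.
nested = [[1, 2], [3, 4]]
try:
    nested[:,]
except TypeError as exc:
    print("list raised TypeError:", exc)
```

    The same key objects reach list.__getitem__ and ndarray.__getitem__; the only difference is what each type is willing to do with a tuple.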

    With an array you can use the tuple notation directly:

    In [490]: np.arange(6).reshape((2,3))[:,[0,1]]
    Out[490]: 
    array([[0, 1],
           [3, 4]])
    In [491]: np.arange(6).reshape((2,3))[(slice(None),[0,1])]
    Out[491]: 
    array([[0, 1],
           [3, 4]])
    In [492]: np.arange(6).reshape((2,3)).__getitem__((slice(None),[0,1]))
    Out[492]: 
    array([[0, 1],
           [3, 4]])
    

    Look at the numpy/lib/index_tricks.py file for examples of fun stuff you can do with __getitem__ (the file location may differ in newer NumPy releases). You can view the file with

    np.source(np.lib.index_tricks)
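
    As a small taste of what that module provides, the sketch below uses two of its public objects, np.s_ and np.r_. Both are instances of classes that define __getitem__, so the square brackets are doing something other than ordinary indexing. The variable names are only for illustration:

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# np.s_ builds an index expression without applying it to anything.
key = np.s_[:, [0, 1]]
print(key)        # a tuple holding a slice and a list
print(arr[key])   # same selection as arr[:, [0, 1]]

# np.r_ uses slice notation inside [] to concatenate ranges and scalars.
print(np.r_[0:3, 10])
```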
    

    A nested list is a list of lists.

    In a nested list, the sublists are independent of the containing list. The container just has pointers to objects elsewhere in memory:

    In [494]: my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    In [495]: my_list
    Out[495]: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    In [496]: len(my_list)
    Out[496]: 3
    In [497]: my_list[1]
    Out[497]: [4, 5, 6]
    In [498]: type(my_list[1])
    Out[498]: list
    In [499]: my_list[1]='astring'
    In [500]: my_list
    Out[500]: [[1, 2, 3], 'astring', [7, 8, 9]]
    

    Here I change the 2nd item of my_list; it is no longer a list, but a string.

    If I apply [:] to a list I just get a shallow copy:

    In [501]: xlist = my_list[:]
    In [502]: xlist[1] = 43
    In [503]: my_list           # didn't change my_list
    Out[503]: [[1, 2, 3], 'astring', [7, 8, 9]]
    In [504]: xlist
    Out[504]: [[1, 2, 3], 43, [7, 8, 9]]
    

    But changing an element of a sublist through xlist does change the corresponding sublist in my_list:

    In [505]: xlist[0][1]=43
    In [506]: my_list
    Out[506]: [[1, 43, 3], 'astring', [7, 8, 9]]
    

    To me this shows why n-dimensional indexing (as implemented for numpy arrays) doesn't make sense with nested lists. Nested lists are multidimensional only to the extent that their contents allow; there's nothing structurally or syntactically multidimensional about them.
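
    A rough illustration of that point, using made-up variable names: selecting a "column" from a nested list means visiting each row yourself, while an array does it with one index. The ragged list at the end shows why a list cannot promise a second dimension at all.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
arr = np.array(nested)

# With a nested list, a "column" has to be assembled row by row.
col_from_list = [row[1] for row in nested]

# With an array, the same selection is a single n-dimensional index.
col_from_arr = arr[:, 1]

print(col_from_list)
print(col_from_arr.tolist())

# A ragged nested list is perfectly legal, so the container itself
# carries no guarantee about the shape of its items.
ragged = [[1, 2, 3], "astring", [7]]
print([len(item) for item in ragged])
```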

    the timings

    Using two [:] on a list does not make a deep copy or work its way down the nesting. It just repeats the shallow copy step:

    In [507]: ylist=my_list[:][:]
    In [508]: ylist[0][1]='boo'
    In [509]: xlist
    Out[509]: [[1, 'boo', 3], 43, [7, 8, 9]]
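
    If an independent copy is what you are after, the standard library's copy.deepcopy is the usual tool. The sketch below starts from a fresh list (not the session above) and uses `is` to check which objects are shared:

```python
import copy

my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

shallow = my_list[:][:]          # two shallow copies in a row
deep = copy.deepcopy(my_list)    # recursively copies the sublists too

# The outer list is new, but the sublists are the very same objects.
print(shallow is my_list)        # False
print(shallow[0] is my_list[0])  # True
print(deep[0] is my_list[0])     # False

shallow[0][1] = "boo"            # visible through my_list
deep[1][1] = "far"               # not visible through my_list
print(my_list)
```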
    

    arr[:,] just makes a view of arr. The difference between view and copy is part of understanding the difference between basic and advanced indexing.

    So alist[:][:] and arr[:,] are different, but basic ways of making some sort of copy of lists and arrays. Neither computes anything, and neither iterates through the elements. So a timing comparison doesn't tell us much.
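
    The view versus copy distinction can be checked directly. This is a small sketch with illustrative names; np.shares_memory reports whether two arrays overlap in memory:

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

view = arr[:,]           # basic indexing: a view onto the same data
picked = arr[:, [0, 1]]  # advanced indexing: a new array with its own data

print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, picked))  # False

view[0, 0] = 100    # writes through to arr
picked[0, 1] = -1   # leaves arr untouched
print(arr)
```

    So the array analogue of the shallow list copy is even cheaper: arr[:,] creates a new array object but no new data buffer, whereas the list index in arr[:, [0, 1]] forces the selected elements to be copied.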
