Getting a sublist of a Python list, with the given indices?

醉梦人生 2020-12-08 09:50

I have a Python list, say a = [0,1,2,3,4,5,6]. I also have a list of indices, say b = [0,2,4,5]. How can I get the list of elements of a

8 answers
  • 2020-12-08 10:21

    If you are a fan of functional programming, you could use map and list.__getitem__. In Python 3, map returns a lazy iterator, so wrap it in list():

    >>> a = [0,1,2,3,4,5,6]
    >>> b = [0,2,4,5]
    >>> list(map(a.__getitem__, b))
    [0, 2, 4, 5]
    

    That said, a list comprehension is the more idiomatic choice in Python.

  • 2020-12-08 10:21

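    A speed comparison of the methods mentioned here, plus a few others taken from the question "Python dictionary: Get list of values for list of keys". The timings are from Python 2.7, where map and filter return lists. The results are listed in the same order as the %timeit lines: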

    Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Jan 19 2016, 12:08:31) [MSC v.1500 64 bit (AMD64)] on win32
    
    In[2]: import numpy.random as nprnd
    idx = nprnd.randint(1000, size=10000)
    l = nprnd.rand(1000).tolist()
    from operator import itemgetter
    import operator
    f = operator.itemgetter(*idx)
    %timeit f(l)
    %timeit list(itemgetter(*idx)(l))
    %timeit [l[_] for _ in idx]  # list comprehension
    %timeit map(l.__getitem__, idx)
    %timeit list(l[_] for _ in idx)  # a generator expression passed to a list constructor.
    %timeit map(lambda _: l[_], idx)  # using 'map'
    %timeit [x for i, x in enumerate(l) if i in idx]
    %timeit filter(lambda x: l.index(x) in idx, l)  # UPDATE @Kundor: works only for lists with unique elements
    10000 loops, best of 3: 175 µs per loop
    1000 loops, best of 3: 707 µs per loop
    1000 loops, best of 3: 978 µs per loop
    1000 loops, best of 3: 1.03 ms per loop
    1000 loops, best of 3: 1.18 ms per loop
    1000 loops, best of 3: 1.86 ms per loop
    100 loops, best of 3: 12.3 ms per loop
    10 loops, best of 3: 21.2 ms per loop
    

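    So the fastest in this benchmark is the prebuilt getter, f = operator.itemgetter(*idx); f(l). Note that this timing excludes the cost of constructing f, and that f(l) returns a tuple rather than a list.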
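
    The benchmark above depends on IPython, NumPy, and Python 2. As a rough sketch for re-running a similar comparison on Python 3 with only the standard library, something like the following could be used. The names values, idx, candidates, and run_benchmark are illustrative assumptions, not part of the original answer, and the timings will differ by machine and interpreter version.

```python
import random
import timeit
from operator import itemgetter

# Hypothetical test data, sized like the benchmark above:
# 1000 values and 10000 random indices into them.
random.seed(0)
values = [random.random() for _ in range(1000)]
idx = [random.randrange(1000) for _ in range(10000)]

# Built once, outside the timed call.
getter = itemgetter(*idx)

# Each candidate is a zero-argument callable returning the selected items.
candidates = {
    "prebuilt itemgetter": lambda: getter(values),
    "itemgetter built per call": lambda: list(itemgetter(*idx)(values)),
    "list comprehension": lambda: [values[i] for i in idx],
    "map + __getitem__": lambda: list(map(values.__getitem__, idx)),
}


def run_benchmark(repeats=50):
    """Return {name: best observed seconds per call} for each candidate."""
    results = {}
    for name, fn in candidates.items():
        # Take the best of 3 runs, each making `repeats` calls.
        best = min(timeit.repeat(fn, number=repeats, repeat=3)) / repeats
        results[name] = best
    return results


if __name__ == "__main__":
    for name, secs in run_benchmark().items():
        print(f"{name:28s} {secs * 1e6:8.1f} us per call")
```

    The prebuilt case is timed separately on purpose: it only pays off when the same index list is reused across many calls.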

  • 2020-12-08 10:28

    Using a list comprehension, this should work:

    li = [a[i] for i in b]
    

    Testing this:

    >>> a = [0,10,20,30,40,50,60]
    >>> b = [0,2,4,5]
    >>> li = [a[i] for i in b]
    >>> li
    [0, 20, 40, 50]
    
  • 2020-12-08 10:28

    Using numpy.asarray: NumPy supports selecting a subarray by passing a list of indices.

    >>> import numpy as np
    >>> a = [0,10,20,30,40,50,60]
    >>> b = [0,2,4,5]
    >>> res = np.asarray(a)[b].tolist()
    >>> res
    [0, 20, 40, 50]
    
  • 2020-12-08 10:30

    Something different...

    >>> a = range(7)
    >>> b = [0,2,4,5]
    >>> import operator
    >>> operator.itemgetter(*b)(a)
    (0, 2, 4, 5)
    

    The itemgetter function takes one or more keys as arguments, and returns a function which will return the items at the given keys in its argument. So in the above, we create a function which will return the items at index 0, index 2, index 4, and index 5, then apply that function to a.
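
    Two edge cases are worth keeping in mind with this approach: itemgetter with a single key returns the bare item rather than a one-element tuple, and itemgetter called with no keys raises TypeError. A minimal sketch of a wrapper that always returns a list might look like this (take is a hypothetical helper name, not part of the operator module):

```python
from operator import itemgetter


def take(seq, indices):
    """Return the items of seq at the given indices, always as a list.

    Hypothetical helper wrapping operator.itemgetter.
    """
    indices = list(indices)
    if not indices:
        # itemgetter() with no arguments raises TypeError.
        return []
    if len(indices) == 1:
        # A single-key itemgetter returns the item itself, not a tuple.
        return [seq[indices[0]]]
    return list(itemgetter(*indices)(seq))


a = [0, 10, 20, 30, 40, 50, 60]
print(take(a, [0, 2, 4, 5]))  # [0, 20, 40, 50]
print(take(a, [3]))           # [30]
print(take(a, []))            # []
```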

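    In the timings below it is slightly faster than the equivalent list comprehension when the getter is built on every call (388 ns vs. 415 ns), and roughly twice as fast when the getter is built once and reused (183 ns):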

    In [1]: import operator
    
    In [2]: a = range(7)
    
    In [3]: b = [0,2,4,5]
    
    In [4]: %timeit operator.itemgetter(*b)(a)
    1000000 loops, best of 3: 388 ns per loop
    
    In [5]: %timeit [ a[i] for i in b ]
    1000000 loops, best of 3: 415 ns per loop
    
    In [6]: f = operator.itemgetter(*b)
    
    In [7]: %timeit f(a)
    10000000 loops, best of 3: 183 ns per loop
    

    As for why itemgetter is faster: the comprehension has to execute several Python bytecodes for every index in b.

    In [2]: import dis
    
    In [3]: def f(a,b): return [a[i] for i in b]
    
    In [4]: def g(a,b): return operator.itemgetter(*b)(a)
    
    In [5]: dis.dis(f)
      1           0 BUILD_LIST               0
                  3 LOAD_FAST                1 (b)
                  6 GET_ITER
            >>    7 FOR_ITER                16 (to 26)
                 10 STORE_FAST               2 (i)
                 13 LOAD_FAST                0 (a)
                 16 LOAD_FAST                2 (i)
                 19 BINARY_SUBSCR
                 20 LIST_APPEND              2
                 23 JUMP_ABSOLUTE            7
            >>   26 RETURN_VALUE
    

    The itemgetter version, by contrast, has no bytecode loop: it builds the getter and calls it once, and the iteration over the indices happens in C:

    In [6]: dis.dis(g)
      1           0 LOAD_GLOBAL              0 (operator)
                  3 LOAD_ATTR                1 (itemgetter)
                  6 LOAD_FAST                1 (b)
                  9 CALL_FUNCTION_VAR        0
                 12 LOAD_FAST                0 (a)
                 15 CALL_FUNCTION            1
                 18 RETURN_VALUE
    
  • 2020-12-08 10:36

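    Many of the proposed solutions will raise an IndexError if b contains an index that is out of range for a. The following skips invalid indices, if that is desired. Note that it returns elements in the order they appear in a rather than in the order of b, and each element appears at most once even if its index is repeated in b.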

    >>> b = [0,2,4,5]
    >>> a = [0,1,2,3,4,5,6]
    >>> [x for i,x in enumerate(a) if i in b]
    [0, 2, 4, 5]
    >>> b = [0,2,4,500]
    >>> [x for i,x in enumerate(a) if i in b]
    [0, 2, 4]
    

    enumerate produces (index, value) tuples. Since we have both the item and its index, we can check whether the index is present in b.
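
    If the order and repetitions of b should be preserved while still skipping invalid indices, one possible alternative is to iterate over b and bounds-check each index. This is a sketch under that assumption; take_valid is a hypothetical name, and negative indices are treated as invalid here, matching the enumerate version.

```python
def take_valid(seq, indices):
    """Return seq[i] for each i in indices, skipping out-of-range indices.

    Keeps the order (and any repeats) of `indices`. Negative indices are
    treated as invalid, as in the enumerate-based version.
    """
    n = len(seq)
    return [seq[i] for i in indices if 0 <= i < n]


a = [0, 10, 20, 30, 40, 50, 60]
b = [5, 0, 0, 500]

print(take_valid(a, b))                            # [50, 0, 0]
print([x for i, x in enumerate(a) if i in b])      # [0, 50]
```

    This version also avoids scanning b once per element of a, which the `i in b` test does when b is a list.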
