Python: What's the fastest way to zip right to left, and is there no builtin for this?

陌路散爱 提交于 2021-02-08 13:14:37

问题


Given 2 sequences of different lengths:

In [931]: a = [1,2,3]
In [932]: b = [4,5,6,7]

This is what i want

In [933]: c = zip(reversed(a),reversed(b))
In [934]: [x for x in reversed(c)]
Out[934]: [(1, 5), (2, 6), (3, 7)]

But I don't like the idea of using reversed on all my input parameters, and I also don't want to re-implement my own zip function.

So:

  1. Is there a faster/more efficient way to do this?
  2. Is there a simpler way to do this?

回答1:


The following is faster and clearer than the other posted solutions:

s = zip(reversed(a), reversed(b))
s.reverse()                       # in-place

The reversed built-in creates an iterator those loops backwards at the same speed as forward iteration. No additional memory (full copies of the input) are produced and there are no slow trips around the Python eval-loop (present in the other solutions that use indicies).




回答2:


I would suggest:

>>> a = [1,2,3]
>>> b = [4,5,6,7]
>>> k = min(len(a),len(b))
>>> zip(a[-k:], b[-k:])
[(1, 5), (2, 6), (3, 7)]



回答3:


The following will work even if you don't know which is longer:

>>> zip(a[-len(b):], b[-len(a):])
[(1, 5), (2, 6), (3, 7)]

Here is an example of what happens with the smaller list:

>>> range(5)[-10:]
[0, 1, 2, 3, 4]

So even though the slice looks kind of funky, if you have a negative start value that is larger than the length of the list, it will give you the entire list.




回答4:


In the following solution, no copy of any of the input lists is done.

def rzip(a,b):
    m = min(len(a),len(b))
    for i in xrange(m):
        yield a[-m+i],b[-m+i]


for a,b in (([1,2,3,4,5,6,7],[8,9,10]),
            ([1,2,3],[8,9,10]),
            ([1,2,3,4],[1,2,3,4,5,6,7,8,9,10])):
    print list(rzip(a,b))

m being the length of the shortest list, -m+0 is always transformed into index 0 for this shortest list,
while -m+0 is transfomed in an index >= 0 for the longest list

result:

[(5, 8), (6, 9), (7, 10)]
[(1, 8), (2, 9), (3, 10)]
[(1, 7), (2, 8), (3, 9), (4, 10)]

Despite the fact the function rzip() isn't a one-liner, I think it is simple and I bet it is the faster solution

.

Edit

At first, in the function rzip() above (and eyquem1 in the following code) , I thought that creating the list of zipped elements by runing from correct position in each list from left to right would be a good procedure too. In the following code, eyquem1bis is the function rzip() slightly improved by a pre-calculation of the starting positions.

But I was interested by the Raymond Hettinger's claim that the exclusive use of zip() and iterator reversed() is the clearer and faster solution. So I imagined that using other iterators from left to right would be better than my function rzip() and I used iter() and islice() to write functions eyquem2 and eyquem3 in the following code.

I compared the execution's times of the diverse solutions proposed in most of the answers.

import random
from time import clock

lima,limb = 20000 , 50000

a = list(range(lima))
b = list(range(limb))
random.shuffle(a)
random.shuffle(b)


A,B,C,D,E,F,G,H,I = [],[],[],[],[],[],[],[],[],

n = 10
for essays in range(100):

    te = clock()
    for rep in range(n):
        k = min(len(a),len(b)) 
        sam1 = list(zip(a[-k:], b[-k:]))
    tref= (clock()-te)/100
    A.append((clock()-te,
              'Sam Hocevar, deducted normal slicing and zip'
              '\nk = min(len(a),len(b))'
              '\nlist(zip(a[-k:], b[-k:]))'))


    te = clock()
    for rep in range(n):
        k = min(len(a),len(b)) 
        sam2 = list(zip(a[len(a)-k:], b[len(b)-k:]))
    tref= (clock()-te)/100
    B.append((clock()-te,
              'Sam Hocevar, exactly normal slicing and zip'
              '\nk = min(len(a),len(b))'
              '\nlist(zip(a[len(a)-k:], b[len(b)-k:]))'))


    te = clock()
    for rep in range(n):
        fj = list(zip(a[-len(b):], b[-len(a):]))
    C.append((clock()-te,
              'F.J. , deducted tricky slicing and zip'
              '\nlist(zip(a[-len(b):], b[-len(a):]))'))


    te = clock()
    for rep in range(n):
        m = min(len(a),len(b))
        sleep = list(zip(a[len(a)-m:],b[len(b)-m:]))
    D.append((clock()-te,
              'sleeplessnerd, exactly normal slicing and zip'
              '\nm = min(len(a),len(b))'
              '\nlist(zip(a[len(a)-m:],b[len(b)-m:]))'))


    te = clock()
    for rep in range(n):
        m = min(len(a),len(b))
        ey1 = [ (a[-m+i],b[-m+i]) for i in range(m) ]
    E.append((clock()-te,
              'eyquem 1, deducted normal slicing and listcomp'
              '\nm = min(len(a),len(b))'
              '\n[ (a[-m+i],b[-m+i]) for i in range(m) ]'))


    te = clock()
    for rep in range(n):
        m = min(len(a),len(b))
        x,y = len(a)-m , len(b)-m
        ey1bis = [ (a[x+i],b[y+i]) for i in range(m)]
    F.append((clock()-te,
              'eyquem 1 improved, exactly normal slicing and listcomp'
              '\nm = min(len(a),len(b)'
              '\nx,y = len(a)-m , len(b)-m'
              '\n[(a[x+i],b[y+i]) for i in range(m)]'))


    te = clock()
    for rep in range(n):
        ita = iter(a)
        itb = iter(b)
        if len(b)>len(a):
            for av in range(len(b)-len(a)):
                itb.__next__()
        else:
            for av in range(len(a)-len(b)):
                itb.__next__()
        ey2 = list(zip(ita,itb))
    G.append((clock()-te,
              'eyquem 2, use of zip and iterator iter'
              '\nlist(zip(ita,itb))'))


    from itertools import islice
    te = clock()
    for rep in range(n):
        if len(b)>len(a):
            ey3 = list(zip(iter(a) , islice(b,len(b)-len(a),None)))
        else:
            ey3 = list(zip(islice(a,len(a)-len(b),None),iter(b)))
    H.append((clock()-te,
              'eyquem 3, use of zip and iterators iter AND islice'
              '\nlist(zip(iter(a),islice(b,len(b)-len(a),None)))'))


    te = clock()
    for rep in range(n):
        ray = list(reversed(list(zip(reversed(a),reversed(b)))))
    I.append((clock()-te,
              'Raymond Hettinger, use of zip and iterator reversed'
              '\nlist(reversed(list(zip(reversed(a),reversed(b)))))'))


print( 'len(a) == %d\nlen(b) == %d\n' % (len(a),len(b)) )

tempi = [min(x) for x in (A,B,C,D,E,F,G,H,I)]
tempi.sort(reverse=True)
tref = tempi[0][0]/100
print( '\n\n'.join('%.2f %%     %s' % (t/tref,ch) for t,ch in tempi) )


print('\nsam1==sam2==fj==sleep==ey1==ey1bis==ey2==ey3==ray  is ',sam1==sam2==fj==sleep==ey1==ey1bis==ey2==ey3==ray)

A result with lists of very different lengths:

len(a) == 20000
len(b) == 50000

100.00 %     Sam Hocevar, exactly normal slicing and zip
k = min(len(a),len(b))
list(zip(a[len(a)-k:], b[len(b)-k:]))

99.80 %     Sam Hocevar, deducted normal slicing and zip
k = min(len(a),len(b))
list(zip(a[-k:], b[-k:]))

98.02 %     sleeplessnerd, exactly normal slicing and zip
m = min(len(a),len(b))
list(zip(a[len(a)-m:],b[len(b)-m:]))

97.98 %     F.J. , deducted tricky slicing and zip
list(zip(a[-len(b):], b[-len(a):]))

82.30 %     eyquem 2, use of zip and iterator iter
list(zip(ita,itb))

69.61 %     eyquem 1, deducted normal slicing and listcomp
m = min(len(a),len(b))
[ (a[-m+i],b[-m+i]) for i in range(m) ]

67.62 %     eyquem 1 improved, exactly normal slicing and listcomp
m = min(len(a),len(b)
x,y = len(a)-m , len(b)-m
[(a[x+i],b[y+i]) for i in range(m)]

61.23 %     eyquem 3, use of zip and iterators iter AND islice
list(zip(iter(a),islice(b,len(b)-len(a),None)))

60.92 %     Raymond Hettinger, use of zip and iterator reversed
list(reversed(list(zip(reversed(a),reversed(b)))))

sam1==sam2==fj==sleep==ey1==ey1bis==ey2==ey3==ray  is  True

Result with two lists of similar lengths:

len(a) == 49500
len(b) == 50000

100.00 %     Sam Hocevar, deducted normal slicing and zip
k = min(len(a),len(b))
list(zip(a[-k:], b[-k:]))

99.39 %     F.J. , deducted tricky slicing and zip
list(zip(a[-len(b):], b[-len(a):]))

99.12 %     sleeplessnerd, exactly normal slicing and zip
m = min(len(a),len(b))
list(zip(a[len(a)-m:],b[len(b)-m:]))

98.10 %     Sam Hocevar, exactly normal slicing and zip
k = min(len(a),len(b))
list(zip(a[len(a)-k:], b[len(b)-k:]))

69.91 %     eyquem 1, deducted normal slicing and listcomp
m = min(len(a),len(b))
[ (a[-m+i],b[-m+i]) for i in range(m) ]

66.54 %     eyquem 1 improved, exactly normal slicing and listcomp
m = min(len(a),len(b)
x,y = len(a)-m , len(b)-m
[(a[x+i],b[y+i]) for i in range(m)]

58.94 %     Raymond Hettinger, use of zip and iterator reversed
list(reversed(list(zip(reversed(a),reversed(b)))))

51.29 %     eyquem 2, use of zip and iterator iter
list(zip(ita,itb))

51.17 %     eyquem 3, use of zip and iterators iter AND islice
list(zip(iter(a),islice(b,len(b)-len(a),None)))

sam1==sam2==fj==sleep==ey1==ey1bis==ey2==ey3==ray  is  True

I conclude that there are 3 groups of solutions:

  • the solutions of Sam Hocevar, FJ and sleeplessnerd have the longest executions.
    They use different combinations of slicing and function zip().
    By the way the overwhelmingly approved Sam's solution is the longest.

  • my function rzip() and rzip()-improved execute in 68 % of the time taken by the ones of the above group.
    They use only a list comprehension of elements obtained by their indices

  • my solutions eyquem2 and eyquem3 , and the Raymond Hettinger's solution, take less than 61 % of the said preceding times.
    They use iterators.

The execution's times of my solutions depend on the relative lengths of the list:
for lists of similar lengths, solutions eyquem2 and eyquem3 have execution's times divided by 2 !
But for very different lists , eyquem2 surprisingly takes as much as 82 % of the reference time.

Constant execution's time around 60 % of reference time of Raymond Hettinger's solution confirms his claim.
My solution eyquem3 with iter() and islice() is not so bad too, but it isn't so clear than Raymond Hettinger's one, though this latter is a little obscure too at first glance.




回答5:


zip(a[len(a)-min(len(a), len(b)):], b[len(b)-min(len(a), len(b)):])

but it does not look very "pythonic" :P




回答6:


[Updated to add possibility of reversed ordering (because I thought I might use this one day)]

As a generator, you could try,

def reversed_zip(*l, from_right=False):

    i = min(len(l) for l in l)
    i, j, k = (1, i+1, 1) if from_right else (i, 0, -1) 

    while i != j:
        yield tuple( x[-i] for x in l ) 
        i += k

This doesn't have to take copies of the lists, and gives,

>>> a = [1,2,3]
>>> b = [4,5,6,7]

>>> list(reversed_zip(a, b))
[(1, 5), (2, 6), (3, 7)]

>>> list(reversed_zip(a, b, from_right=True))
[(3, 7), (2, 6), (1, 5)]


来源:https://stackoverflow.com/questions/8013071/python-whats-the-fastest-way-to-zip-right-to-left-and-is-there-no-builtin-for

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!