Faster numpy-solution instead of itertools.combinations?

ぃ、小莉子 提交于 2019-11-29 07:11:54

Here's one that's slightly faster than itertools UPDATE: and one (nump2) that's actually quite a bit faster:

import numpy as np
import itertools
import timeit

def nump(n, k, i=0):
    if k == 1:
        a = np.arange(i, i+n)
        return tuple([a[None, j:] for j in range(n)])
    template = nump(n-1, k-1, i+1)
    full = np.r_[np.repeat(np.arange(i, i+n-k+1),
                           [t.shape[1] for t in template])[None, :],
                 np.c_[template]]
    return tuple([full[:, j:] for j in np.r_[0, np.add.accumulate(
        [t.shape[1] for t in template[:-1]])]])

def nump2(n, k):
    a = np.ones((k, n-k+1), dtype=int)
    a[0] = np.arange(n-k+1)
    for j in range(1, k):
        reps = (n-k+j) - a[j-1]
        a = np.repeat(a, reps, axis=1)
        ind = np.add.accumulate(reps)
        a[j, ind[:-1]] = 1-reps[1:]
        a[j, 0] = j
        a[j] = np.add.accumulate(a[j])
    return a

def itto(L, N):
    return np.array([a for a in itertools.combinations(L,N)]).T

k = 6
n = 12
N = np.arange(n)

assert np.all(nump2(n,k) == itto(N,k))

print('numpy    ', timeit.timeit('f(a,b)', number=100, globals={'f':nump, 'a':n, 'b':k}))
print('numpy 2  ', timeit.timeit('f(a,b)', number=100, globals={'f':nump2, 'a':n, 'b':k}))
print('itertools', timeit.timeit('f(a,b)', number=100, globals={'f':itto, 'a':N, 'b':k}))

Timings:

k = 3, n = 50
numpy     0.06967267207801342
numpy 2   0.035096961073577404
itertools 0.7981023890897632

k = 3, n = 10
numpy     0.015058324905112386
numpy 2   0.0017436158377677202
itertools 0.004743851954117417

k = 6, n = 12
numpy     0.03546895203180611
numpy 2   0.00997065706178546
itertools 0.05292179994285107

This is is most certainly not faster than itertools.combinations but it is vectorized numpy:

def nd_triu_indices(T,N):
    o=np.array(np.meshgrid(*(np.arange(len(T)),)*N))
    return np.array(T)[o[...,np.all(o[1:]>o[:-1],axis=0)]]

 %timeit np.array(list(itertools.combinations(T,N))).T
The slowest run took 4.40 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.6 µs per loop

%timeit nd_triu_indices(T,N)
The slowest run took 4.64 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 52.4 µs per loop

Not sure if this is vectorizable another way, or if one of the optimization wizards around here can make this method faster.

EDIT: Came up with another way, but still not faster than combinations:

%timeit np.array(T)[np.array(np.where(np.fromfunction(lambda *i: np.all(np.array(i)[1:]>np.array(i)[:-1], axis=0),(len(T),)*N,dtype=int)))]
The slowest run took 7.78 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 34.3 µs per loop
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!