Numpy-vectorized function to repeat blocks of consecutive elements

Question

NumPy has a repeat function that repeats each element of an array a given (per-element) number of times.
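As a quick illustration of that per-element behavior, here is a minimal sketch; the array values and counts are made up for the example:

```python
import numpy as np

x = np.array([10, 20, 30])
counts = np.array([2, 0, 3])  # per-element repeat counts (illustrative values)

# Each element x[i] appears counts[i] times, in the original order.
repeated = np.repeat(x, counts)
print(repeated)  # [10 10 30 30 30]
```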

I want to implement a function that does a similar thing, but repeats variably sized blocks of consecutive elements rather than individual elements. Essentially, I want the following function:

import numpy as np

def repeat_blocks(a, sizes, repeats):
    b = []
    start = 0
    for i, size in enumerate(sizes):
        end = start + size
        b.extend([a[start:end]] * repeats[i])
        start = end
    return np.concatenate(b)

For example, given

a = np.arange(20)
sizes = np.array([3, 5, 2, 6, 4])
repeats = np.array([2, 3, 2, 1, 3])

then

repeat_blocks(a, sizes, repeats)

returns

array([ 0,  1,  2, 
        0,  1,  2,

        3,  4,  5,  6,  7, 
        3,  4,  5,  6,  7, 
        3,  4,  5,  6,  7, 

        8,  9, 
        8,  9,

        10, 11, 12, 13, 14, 15,

        16, 17, 18, 19,
        16, 17, 18, 19,
        16, 17, 18, 19 ])

I want to push these loops into NumPy for the sake of performance. Is this possible? If so, how?

Answer 1:

Here's one vectorized approach using cumsum:

# Note: this approach assumes every entry of sizes and repeats is at least 1.

# Group ID of every repeated block, e.g. [0, 0, 1, 1, 1, ...]
r1 = np.repeat(np.arange(len(sizes)), repeats)

# Total size of the output, needed to initialize the indexing array
N = (sizes*repeats).sum() # or np.dot(sizes, repeats)

# Initialize the indexing array with ones, so that a final cumulative sum
# produces incrementing indices within each block.
# Two steps here:
# 1. At the start of each repeated copy of a block, jump back to the first
# element of that block by inserting an offset of (1 - block size).
id_ar = np.ones(N, dtype=int)
id_ar[0] = 0
insert_index = sizes[r1[:-1]].cumsum()
insert_val = (1-sizes)[r1[:-1]]

# 2. Where one group ends and the next begins, indexing must continue with
# the next group's first element. So, simply assign 1s there.
insert_val[r1[1:] != r1[:-1]] = 1

# Assign the index-offsetting values
id_ar[insert_index] = insert_val

# Finally, index into the input array to get the block-repeated output
out = a[id_ar.cumsum()]
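To make the steps above easier to check, here is a sketch that wraps them in a function and compares the result with the loop from the question. The names repeat_blocks_cumsum and repeat_blocks_loop are hypothetical, and the sketch assumes every entry of sizes and repeats is at least 1, since the offset logic does not account for empty blocks or blocks repeated zero times.

```python
import numpy as np

def repeat_blocks_loop(a, sizes, repeats):
    """Reference implementation: the loop from the question."""
    b = []
    start = 0
    for size, rep in zip(sizes, repeats):
        end = start + size
        b.extend([a[start:end]] * int(rep))
        start = end
    return np.concatenate(b)

def repeat_blocks_cumsum(a, sizes, repeats):
    """The cumsum steps as one function (hypothetical name).

    Assumes all entries of sizes and repeats are >= 1.
    """
    sizes = np.asarray(sizes)
    repeats = np.asarray(repeats)
    # Group ID of every repeated block
    r1 = np.repeat(np.arange(len(sizes)), repeats)
    # Total output length
    N = int(np.dot(sizes, repeats))
    # Increments of the final index; the cumulative sum turns them into indices
    id_ar = np.ones(N, dtype=int)
    id_ar[0] = 0
    insert_index = sizes[r1[:-1]].cumsum()
    insert_val = (1 - sizes)[r1[:-1]]
    # At a group boundary, simply step forward by one
    insert_val[r1[1:] != r1[:-1]] = 1
    id_ar[insert_index] = insert_val
    return a[id_ar.cumsum()]

a = np.arange(20)
sizes = np.array([3, 5, 2, 6, 4])
repeats = np.array([2, 3, 2, 1, 3])

vectorized = repeat_blocks_cumsum(a, sizes, repeats)
looped = repeat_blocks_loop(a, sizes, repeats)
print(np.array_equal(vectorized, looped))  # True
```

If zero-length blocks or zero repeat counts can occur in your data, filter them out first or fall back to the loop version.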

Answer 2:

This function is a great candidate for speeding up with Numba:

import numba
import numpy as np

@numba.njit
def repeat_blocks_jit(a, sizes, repeats):
    out = np.empty((sizes * repeats).sum(), a.dtype)
    start = 0
    oi = 0
    for i, size in enumerate(sizes):
        end = start + size
        for rep in range(repeats[i]):
            oe = oi + size
            out[oi:oe] = a[start:end]
            oi = oe
        start = end
    return out

This is significantly faster than Divakar's pure NumPy solution, and a lot closer to your original code. I made no effort at all to optimize it. Note that np.dot() and np.repeat() can't be used here, but that doesn't matter when all the code gets compiled.

Plus, since njit means "nopython" mode, you can even use @numba.njit(nogil=True), which releases the GIL while the function runs, and get a multicore speedup by running many of these calls from separate threads.
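A minimal sketch of that threaded usage follows. The function name repeat_blocks_nogil, the thread pool size, and the repeated workload are all made up for illustration. As an assumption for this sketch only, it falls back to plain Python when Numba is not installed, so it still runs (without any speedup).

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

try:
    import numba
    njit_nogil = numba.njit(nogil=True)
except ImportError:
    # Assumption for this sketch only: without Numba, run the plain Python
    # function. The results are the same, but there is no parallel speedup.
    def njit_nogil(func):
        return func

@njit_nogil
def repeat_blocks_nogil(a, sizes, repeats):
    out = np.empty((sizes * repeats).sum(), a.dtype)
    start = 0
    oi = 0
    for i in range(len(sizes)):
        size = sizes[i]
        end = start + size
        for rep in range(repeats[i]):
            oe = oi + size
            out[oi:oe] = a[start:end]
            oi = oe
        start = end
    return out

a = np.arange(20)
sizes = np.array([3, 5, 2, 6, 4])
repeats = np.array([2, 3, 2, 1, 3])

# Warm-up call so that compilation happens once, before the threads start.
expected = repeat_blocks_nogil(a, sizes, repeats)

# Hypothetical workload: the same inputs submitted eight times.
jobs = [(a, sizes, repeats)] * 8
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda args: repeat_blocks_nogil(*args), jobs))

print(all(np.array_equal(r, expected) for r in results))  # True
```

Any real gain depends on each call doing enough work to outweigh the thread overhead, so it is worth timing on your own data.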

Source: https://stackoverflow.com/questions/51154989/numpy-vectorized-function-to-repeat-blocks-of-consecutive-elements

Tags: python, algorithm, numpy, vectorization