vectorization | 易学教程

How to sample a numpy array and perform computation on each sample efficiently?

Question: Assume I have a 1d array; what I want is to sample it with a moving window and, within each window, divide every element by the first element. For example, if I have [2, 5, 8, 9, 6] and a window size of 3, the result will be [[1, 2.5, 4], [1, 1.6, 1.8], [1, 1.125, 0.75]]. What I'm doing now is basically a for loop:

    import numpy as np
    arr = np.array([2., 5., 8., 9., 6.])
    window_size = 3
    result = []  # collects one normalized window per start index
    for i in range(len(arr) - window_size + 1):
        result.append(arr[i : i + window_size] / arr[i])

etc. When the array is
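A loop-free version of the computation described in this question might look like the sketch below. It assumes NumPy 1.20 or newer, where `numpy.lib.stride_tricks.sliding_window_view` is available; the helper name `normalized_windows` is made up for illustration and is not from the original post.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def normalized_windows(arr, window_size):
    """Return every length-`window_size` window of `arr`, divided by its first element."""
    # Shape (len(arr) - window_size + 1, window_size); a read-only view, no copy.
    windows = sliding_window_view(arr, window_size)
    # windows[:, :1] keeps a trailing axis of length 1 so it broadcasts per row.
    return windows / windows[:, :1]


arr = np.array([2., 5., 8., 9., 6.])
result = normalized_windows(arr, 3)
print(result)
```

The division allocates a new array, so the read-only view returned by `sliding_window_view` is never modified.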

What do gcc's auto-vectorization messages mean?

Question: I have some code that I would like to run fast, so I was hoping I could persuade gcc (g++) to vectorise some of my inner loops. My compiler flags include -O3 -msse2 -ffast-math -ftree-vectorize -ftree-vectorizer-verbose=5 but gcc fails to vectorize the most important loops, giving me the following not-really-very-verbose-at-all messages: Not vectorized: complicated access pattern. and Not vectorized: unsupported use in stmt. My questions are (1) what exactly do these mean? (How complicated

How to build “vectorized” building blocks using itertools module?

Question: The recipe section of itertools docs begins with this text: The extended tools offer the same high performance as the underlying toolset. The superior memory performance is kept by processing elements one at a time rather than bringing the whole iterable into memory all at once. Code volume is kept small by linking the tools together in a functional style which helps eliminate temporary variables. High speed is retained by preferring “vectorized” building blocks over the use of for-loops and

Vectorizing array indexing/subsetting in Matlab

Question: Suppose I have a long data vector y, plus some indices into it. I want to extract a short snippet or window around every index. For example, suppose I want to construct a matrix containing 64 samples before and 64 samples after every value that is below three. This is trivial to do in a for-loop: WIN_SIZE = 64; % Sample data with padding data = [nan(WIN_SIZE,1); randn(1e6,1); nan(WIN_SIZE,1)]; % Sample events, could be anything index = find(data < 3); snippets = nan(length(index), 2*WIN_SIZE
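The question is about Matlab, but the underlying idea (add a row of offsets to a column of event indices and index once) can be sketched in NumPy. This is an illustration under assumptions, not the original poster's code: the helper name `extract_windows` is hypothetical, each window is assumed to include the centre sample (width `2 * half_width + 1`), and the data is assumed to be padded so that no window runs off either end.

```python
import numpy as np


def extract_windows(data, centers, half_width):
    """Return one row per centre index, covering centre - half_width .. centre + half_width."""
    offsets = np.arange(-half_width, half_width + 1)
    # Broadcasting a column of centres against a row of offsets gives a
    # (len(centers), 2 * half_width + 1) index matrix; one fancy-indexing
    # call then replaces the explicit loop.
    return data[centers[:, None] + offsets]


data = np.arange(10.0)
centers = np.array([2, 5])
snippets = extract_windows(data, centers, 2)
print(snippets)
```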

Get length of runs of missing values in vector

Question: What's a clever (i.e., not a loop) way to get the length of each spell of missing values in a vector? My ideal output is a vector that is the same length, in which each missing value is replaced by the length of the spell of missing values of which it was a part, and all other values are 0's. So, for input like: x <- c(2,6,1,2,NA,NA,NA,3,4,NA,NA) I'd like output like: y <- c(0,0,0,0,3,3,3,0,0,2,2) Answer 1: One simple option using rle : m <- rle(is.na(x)) > rep(ifelse(m$values,m$lengths,0),times =
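For comparison, the same run-length idea can be sketched in NumPy, with NaN standing in for R's NA. This is not the R answer quoted above; the function name `na_run_lengths` is hypothetical, and the input is assumed to be convertible to a float array.

```python
import numpy as np


def na_run_lengths(x):
    """Replace each NaN by the length of the NaN run it belongs to; everything else becomes 0."""
    x = np.asarray(x, dtype=float)
    if x.size == 0:
        return np.zeros(0, dtype=int)
    is_na = np.isnan(x)
    # Positions where a new run starts: index 0, plus every index whose
    # missing/non-missing status differs from the previous element.
    change = np.flatnonzero(is_na[1:] != is_na[:-1]) + 1
    starts = np.concatenate(([0], change))
    lengths = np.diff(np.concatenate((starts, [x.size])))
    # Keep the run length only for runs of missing values, then expand
    # each run back to its original length.
    return np.repeat(np.where(is_na[starts], lengths, 0), lengths)


x = [2, 6, 1, 2, np.nan, np.nan, np.nan, 3, 4, np.nan, np.nan]
y = na_run_lengths(x)
print(y)
```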

Finding intersection of two matrices in Python within a tolerance?

Question: I'm looking for the most efficient way of finding the intersection of two different-sized matrices. Each matrix has three variables (columns) and a varying number of observations (rows). For example, matrix A: a = np.matrix('1 5 1003; 2 4 1002; 4 3 1008; 8 1 2005') b = np.matrix('7 9 1006; 4 4 1007; 7 7 1050; 8 2 2003; 9 9 3000; 7 7 1000') If I set the tolerance for each column as col1 = 1 , col2 = 2 , and col3 = 10 , I would want a function such that it would output the indices in a and b
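One way to sketch this kind of tolerance match is to compare all row pairs with broadcasting. The version below is a minimal illustration, not the accepted answer: the name `match_within_tolerance` is hypothetical, tolerances are assumed to be inclusive (`<=`), and plain arrays are used instead of `np.matrix`. It builds an array of shape `(len(a), len(b), 3)`, so memory grows with the product of the row counts.

```python
import numpy as np


def match_within_tolerance(a, b, tol):
    """Return (rows_in_a, rows_in_b) for every pair of rows that agree within `tol` per column."""
    a = np.asarray(a)
    b = np.asarray(b)
    # diff[i, j, k] = |a[i, k] - b[j, k]| for every pair of rows (i, j).
    diff = np.abs(a[:, None, :] - b[None, :, :])
    # A pair matches only if every column is within its own tolerance.
    close = np.all(diff <= np.asarray(tol), axis=2)
    return np.nonzero(close)


a = np.array([[1, 5, 1003], [2, 4, 1002], [4, 3, 1008], [8, 1, 2005]])
b = np.array([[7, 9, 1006], [4, 4, 1007], [7, 7, 1050],
              [8, 2, 2003], [9, 9, 3000], [7, 7, 1000]])
rows_a, rows_b = match_within_tolerance(a, b, tol=[1, 2, 10])
print(rows_a, rows_b)
```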

Numpy repeat for 2d array

Question: Given two arrays, say arr = array([10, 24, 24, 24, 1, 21, 1, 21, 0, 0], dtype=int32) rep = array([3, 2, 2, 0, 0, 0, 0, 0, 0, 0], dtype=int32) np.repeat(arr, rep) returns array([10, 10, 10, 24, 24, 24, 24], dtype=int32) Is there any way to replicate this functionality for a set of 2D arrays? That is given arr = array([[10, 24, 24, 24, 1, 21, 1, 21, 0, 0], [10, 24, 24, 1, 21, 1, 21, 32, 0, 0]], dtype=int32) rep = array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0], [2, 2, 2, 0, 0, 0, 0, 0, 0, 0]], dtype
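The snippet is cut off before it states the desired output, so the following is only a sketch under an assumption: each row is repeated independently and, because rows can end up with different lengths, the result is returned as a list of 1-D arrays. The helper name `repeat_rows` is hypothetical.

```python
import numpy as np


def repeat_rows(arr, rep):
    """Apply np.repeat row by row without a Python loop over elements."""
    # Repeat over the flattened arrays in a single call (row-major order
    # keeps each row's elements contiguous).
    flat = np.repeat(arr.ravel(), rep.ravel())
    # The total repeat count per row tells us where to cut the flat result.
    row_totals = rep.sum(axis=1)
    return np.split(flat, np.cumsum(row_totals)[:-1])


arr = np.array([[10, 24, 24, 24, 1, 21, 1, 21, 0, 0],
                [10, 24, 24, 1, 21, 1, 21, 32, 0, 0]], dtype=np.int32)
rep = np.array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
                [2, 2, 2, 0, 0, 0, 0, 0, 0, 0]], dtype=np.int32)
rows = repeat_rows(arr, rep)
print(rows)
```

If every row happens to have the same total repeat count, `flat.reshape(len(arr), -1)` would give a regular 2-D array instead.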

How can I vectorize code that runs a function on subsets of a larger matrix?

Question: Let's assume I have the following 9 x 5 matrix: myArray = [ 54.7 8.1 81.7 55.0 22.5 29.6 92.9 79.4 62.2 17.0 74.4 77.5 64.4 58.7 22.7 18.8 48.6 37.8 20.7 43.5 68.6 43.5 81.1 30.1 31.1 18.3 44.6 53.2 47.0 92.3 36.8 30.6 35.0 23.0 43.0 62.5 50.8 93.9 84.4 18.4 78.0 51.0 87.5 19.4 90.4 ]; I have 11 "subsets" of this matrix and I need to run a function (let's say max ) on each of these subsets. The subsets can be identified with the following matrix of logicals (identified column-wise, not row

Loop over (or vectorize) variable length matrices in Theano

Question: I have a list of matrices L , where each item M is an x*n matrix ( x is a variable, n is a constant). I want to compute the sum of M'*M for all items in L ( M' is the transpose of M ) as the following Python code does: for M in L: res += np.dot(M.T, M) Actually I want to implement this in Theano (which doesn't support variable length multidimensional arrays), and I don't want to pad all matrices to the same size because that will waste too much space (some of the matrices can be very large).
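One observation that may help here: stacking all the matrices vertically into a single matrix S gives S'S equal to the sum of M'M over the list, so no padding is needed. The NumPy sketch below only illustrates that identity; it is not Theano code, and the helper name `sum_of_grams` is hypothetical.

```python
import numpy as np


def sum_of_grams(matrices):
    """Compute sum(M.T @ M for M in matrices) with a single matrix product."""
    # Every matrix has n columns, so they can be stacked along axis 0
    # even though their row counts differ.
    stacked = np.concatenate(matrices, axis=0)
    return stacked.T @ stacked


L = [np.arange(6.0).reshape(2, 3), np.arange(12.0).reshape(4, 3)]
res = sum_of_grams(L)

# Reference result from the loop given in the question.
res_loop = np.zeros((3, 3))
for M in L:
    res_loop += np.dot(M.T, M)
print(np.allclose(res, res_loop))
```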

How to sum parts of a matrix of different sizes, without using for loops?

Question: I have a relatively large matrix NxN (N~20,000) and a Nx1 vector identifying the indices that must be grouped together. I want to sum together parts of the matrix, which in principle can have a different number of elements and non-adjacent elements. I quickly wrote a double for-loop that works correctly but of course it is inefficient. The profiler identified these loops as one of the bottlenecks in my code. I tried to find a smart vectorization method to solve the problem. I explored the
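The snippet is truncated, so the exact grouping is not known. Assuming the goal is a group-by-group matrix whose entry (g, h) sums every element whose row belongs to group g and whose column belongs to group h, a NumPy sketch with a one-hot indicator matrix could look like this. The name `group_block_sums` is hypothetical, and for N around 20,000 a sparse indicator matrix would likely be preferable to the dense one used here.

```python
import numpy as np


def group_block_sums(A, labels):
    """Sum A over blocks defined by `labels`; result[g, h] covers rows in group g, columns in group h."""
    labels = np.asarray(labels).ravel()
    groups, inverse = np.unique(labels, return_inverse=True)
    # S[i, g] = 1 when row/column i belongs to group g, else 0.
    S = np.zeros((labels.size, groups.size), dtype=A.dtype)
    S[np.arange(labels.size), inverse] = 1
    # S.T @ A sums rows within each group; the trailing @ S sums columns.
    return S.T @ A @ S


A = np.arange(16.0).reshape(4, 4)
labels = np.array([0, 1, 0, 1])
sums = group_block_sums(A, labels)
print(sums)
```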