vectorization | 易学教程

Efficient search for permutations that contain sub-permutations via array operations?

Question: I have a set of integers, say S = {1,...,10}, and two matrices N and M whose rows are some (but not necessarily all possible) permutations of elements of S, of orders, say, 3 and 5 respectively, e.g. N = [1 2 3; 2 5 3;...], M = [1 2 3 4 5; 2 4 7 8 1;...]. A sub-permutation Q of a permutation P is an indexed subset of P whose elements appear in the same order as they do in P. Example: [2,4,7] is a sub-permutation of [2,3,4,6,7,1], but [1,2
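
The definition above can be checked for every (row of M, row of N) pair at once with array operations. The following is a minimal NumPy sketch, not the approach from the original thread: the function name `contains_subperm`, the `max_value` parameter, and the third row of `N` are hypothetical and chosen for illustration. It assumes every row holds distinct integers between 1 and `max_value`.

```python
import numpy as np

def contains_subperm(M, N, max_value):
    """Return a boolean (len(M), len(N)) array: entry [i, j] is True when
    row N[j] is a sub-permutation of row M[i].
    Assumes each row holds distinct integers in 1..max_value."""
    M = np.asarray(M)
    N = np.asarray(N)
    # pos[i, v] = index of value v within row M[i], or -1 if v is absent.
    pos = np.full((M.shape[0], max_value + 1), -1, dtype=np.intp)
    rows = np.arange(M.shape[0])[:, None]
    pos[rows, M] = np.arange(M.shape[1])
    # p[i, j, k] = position in M[i] of the k-th element of N[j].
    p = pos[:, N]
    present = (p >= 0).all(axis=-1)                      # every element occurs in M[i]
    increasing = (np.diff(p, axis=-1) > 0).all(axis=-1)  # and in the same order
    return present & increasing

M = np.array([[1, 2, 3, 4, 5], [2, 4, 7, 8, 1]])
N = np.array([[1, 2, 3], [2, 5, 3], [2, 7, 1]])  # third row is a made-up example
result = contains_subperm(M, N, max_value=10)
```

The lookup table costs memory proportional to the number of rows of M times `max_value`, and the intermediate `p` grows with the product of the two row counts, so very large inputs may need to be processed in chunks.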

Using OpenMP stops GCC auto vectorising

Question: I have been working on making my code auto-vectorisable by GCC. However, when I include the -fopenmp flag, it seems to stop all attempts at auto-vectorisation. I am using -ftree-vectorize -ftree-vectorizer-verbose=5 to vectorise and monitor it. If I do not include the flag, it gives me a lot of information about each loop: whether it is vectorised and, if not, why. The compiler stops when I try to use the omp_get_wtime() function, since it can't be linked. Once the flag is

Is there a reason to prefer '&&' over '&' in 'if' statements, other than short-circuiting?

Question: Yes, I know there have been a number of questions (see this one, for example) regarding the usage of & vs. && in R, but I have not found one that specifically answers my question. As I understand the differences, & does an element-wise, vectorised comparison, much like the arithmetic operators. It hence returns a logical vector that has length > 1 if both arguments have length > 1. && compares the first elements of both vectors and always returns a result of length 1. Moreover, it does

Auto-Vectorize comparison

Question: I'm having trouble getting g++ 5.4 to vectorize a comparison. Basically, I want to compare 4 unsigned ints using vectorization. My first approach was straightforward:

    bool compare(unsigned int const pX[4]) {
        bool c1 = (pX[0] < 1);
        bool c2 = (pX[1] < 2);
        bool c3 = (pX[2] < 3);
        bool c4 = (pX[3] < 4);
        return c1 && c2 && c3 && c4;
    }

Compiling with g++ -std=c++11 -Wall -O3 -funroll-loops -march=native -mtune=native -ftree-vectorize -msse -msse2 -ffast-math -fopt-info-vec-missed told

Reverse an AVX register containing doubles using a single AVX intrinsic

Question: If I have an AVX register with 4 doubles in it and I want to store the reverse of this in another register, is it possible to do this with a single intrinsic? For example, if I had 4 floats in an SSE register, I could use: _mm_shuffle_ps(A,A,_MM_SHUFFLE(0,1,2,3)); Can I do this using, maybe, _mm256_permute2f128_pd()? I don't think you can address each individual double using the above intrinsic.

Answer 1: You actually need 2 permutes to do this: _mm256_permute2f128_pd() only permutes in

Remove for loop from clustering algorithm in MATLAB

Question: I am trying to improve the performance of the OPTICS clustering algorithm. The open-source implementation I've found uses a for loop over each sample and can run for hours... I believe some use of the repmat() function may help improve its performance when the system has enough RAM. You are more than welcome to suggest other ways of improving the implementation. Here is the code: x is the data: an [m x n] array where m is the sample size and n is the feature dimensionality,

New Dataframe column as a generic function of other rows (pandas)

Question: What is the fastest (and most efficient) way to create a new column in a DataFrame that is a function of other rows in pandas? Consider the following example:

    import pandas as pd

    d = {
        'id': [1, 2, 3, 4, 5, 6],
        'word': ['cat', 'hat', 'hag', 'hog', 'dog', 'elephant']
    }
    pandas_df = pd.DataFrame(d)

Which yields:

       id      word
    0   1       cat
    1   2       hat
    2   3       hag
    3   4       hog
    4   5       dog
    5   6  elephant

Suppose I want to create a new column bar containing a value that is based on the output of using a function foo to compare

Numpy: assigning values to 2d array with list of indices

Question: I have a 2D numpy array (think greyscale image). I want to assign a certain value to a list of coordinates in this array, such that:

    import numpy as np

    img = np.zeros((5, 5))
    coords = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])

    def bad_use_of_numpy(img, coords):
        for i, coord in enumerate(coords):
            img[coord[0], coord[1]] = 255
        return img

    bad_use_of_numpy(img, coords)

This works, but I feel like I can take advantage of numpy functionality to make it faster. I also might have a use case later to do something like
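
One way to drop the Python loop is NumPy integer-array ("fancy") indexing, sketched below with the same `img` and `coords` as in the question. The names `rows` and `cols` are introduced here for illustration only.

```python
import numpy as np

img = np.zeros((5, 5))
coords = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])

# Split the (row, col) pairs into one index array per axis,
# then assign to all listed positions in a single indexing operation.
rows, cols = coords[:, 0], coords[:, 1]
img[rows, cols] = 255
```

An equivalent spelling is `img[tuple(coords.T)] = 255`. If the same coordinate appears more than once, plain assignment simply writes the value repeatedly, which is harmless here because every write stores the same constant.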

How to compute sum of binomial more efficiently?

Question: I must evaluate an equation (not reproduced in this excerpt) in which k1 and k2 are given. I am using MATLAB to compute P. I think my implementation of the equation is correct, but it is very slow. I think the bottleneck is the binomial coefficient. Given the equation, is there a more efficient way to compute it? Thanks, all. For k1=150; k2=150; D=200; it takes 11.6 seconds.

    function main
    warning ('off');
    function test_binom()
    k1=150; k2=150; D=200;
    P=0;
    for i=0:D-1
        for j=0:i
            if (i-j>k2||j

How to Build a Distance Matrix without a Loop (Vectorization)?

Question: I have many points and I want to build a distance matrix, i.e. the distance from every point to every other point, but I want to avoid loops because they take too much time. Is there a better way to build this matrix? This is my loop; for a setl of size 10000x3 it takes a lot of time:

    for i=1:size(setl,1)
        for j=1:size(setl,1)
            dist = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+...
                (zl(i)-zl(j))^2);
            distanceMatrix(i,j) = dist;
        end
    end

Answer 1: How about using some linear algebra? The distance of
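
The answer is cut off here, so the following is only one common reading of "using some linear algebra", written in NumPy rather than MATLAB: expand the squared distance as |a - b|^2 = |a|^2 + |b|^2 - 2 a·b and compute all dot products with a single matrix product. The function name `distance_matrix` and the sample points are hypothetical.

```python
import numpy as np

def distance_matrix(points):
    """Pairwise Euclidean distances between the rows of `points`
    (shape (n, d)), computed without Python-level loops."""
    points = np.asarray(points, dtype=float)
    sq_norms = np.einsum("ij,ij->i", points, points)  # |p_i|^2 for each row
    # |p_i - p_j|^2 = |p_i|^2 + |p_j|^2 - 2 * p_i . p_j
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (points @ points.T)
    np.maximum(d2, 0.0, out=d2)   # clip tiny negatives caused by round-off
    np.fill_diagonal(d2, 0.0)     # distance from a point to itself is exactly 0
    return np.sqrt(d2)

sample = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 5.0]])
dist = distance_matrix(sample)
```

For 10000 points the full matrix holds 10^8 doubles (about 800 MB), so memory rather than loop overhead may become the limiting factor. The expansion can also lose some precision for points that lie very close together.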