vectorization | 易学教程

How do I shift my thinking to 'vectorize my computation' more than using 'for-loops'?

Question: This is definitely more of a notional question, but I wanted to get others' expert input on this topic at SO. Most of my programming lately involves NumPy arrays. I've been matching items in two or so arrays that differ in size. Most of the time I reach for a for-loop or, even worse, a nested for-loop. I'm ultimately trying to avoid for-loops as I gain more experience in data science, because for-loops perform slower. I am well aware of NumPy and the pre-defined cmds
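As a rough illustration of the kind of matching the question describes, here is a minimal NumPy sketch that finds the items two arrays of different sizes have in common without an explicit loop. The arrays `a` and `b` are made-up sample data, not from the original question.

```python
import numpy as np

# Hypothetical sample data: two arrays of different sizes.
a = np.array([3, 7, 1, 9, 4])
b = np.array([9, 3, 5])

# Boolean mask: which elements of `a` also occur in `b`?
in_b = np.isin(a, b)

# Broadcasting compares every element of `a` with every element of `b`
# in one shot, producing a (len(a), len(b)) table of matches.
eq = a[:, None] == b[None, :]

# Index pairs (ia[k], ib[k]) such that a[ia[k]] == b[ib[k]].
ia, ib = np.nonzero(eq)
```

The broadcast table costs `len(a) * len(b)` memory, so for large inputs `np.isin` or a sort-based approach is usually the safer choice.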

How to apply corr2 functions in Multidimensional arrays in MATLAB?

Question: Let's say I have two matrices A and B: A = rand(4,5,3); B = rand(4,5,6); I want to apply the function 'corr2' to calculate the correlation coefficients: corr2(A(:,:,1),B(:,:,1)) corr2(A(:,:,1),B(:,:,2)) corr2(A(:,:,1),B(:,:,3)) ... corr2(A(:,:,1),B(:,:,6)) ... corr2(A(:,:,2),B(:,:,1)) corr2(A(:,:,2),B(:,:,2)) ... corr2(A(:,:,3),B(:,:,6)) How can I avoid loops and vectorize this? Answer 1: Hacked into the m-file for corr2 to create a customized vectorized version for working with 3D
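The answer's code is cut off in this excerpt, so the following is not the answerer's MATLAB version. It is a NumPy sketch of the same idea, assuming the usual definition of the 2D correlation coefficient: center each slice, flatten it into a column, and get all pairwise coefficients from one matrix product. The function name `pairwise_corr2` is made up for illustration.

```python
import numpy as np

def pairwise_corr2(A, B):
    """Correlation coefficient between every slice A[:, :, i] and B[:, :, j]."""
    # Flatten each 2D slice into one column (astype makes a copy).
    a = A.reshape(-1, A.shape[2]).astype(float)
    b = B.reshape(-1, B.shape[2]).astype(float)
    # Center every column on its own mean.
    a -= a.mean(axis=0)
    b -= b.mean(axis=0)
    # Numerator: all pairwise sums of products in a single matmul.
    num = a.T @ b
    # Denominator: outer product of the per-slice norms.
    den = np.sqrt((a ** 2).sum(axis=0))[:, None] * np.sqrt((b ** 2).sum(axis=0))[None, :]
    return num / den

rng = np.random.default_rng(0)
A = rng.random((4, 5, 3))
B = rng.random((4, 5, 6))
R = pairwise_corr2(A, B)  # R[i, j] pairs slice i of A with slice j of B
```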

vectorized radix sort with numpy - can it beat np.sort?

Question: NumPy doesn't yet have a radix sort, so I wondered whether it was possible to write one using pre-existing NumPy functions. So far I have the following, which does work, but is about 10 times slower than NumPy's quicksort. Test and benchmark: a = np.random.randint(0, 10**8, 10**6) assert(np.all(radix_sort(a) == np.sort(a))) %timeit np.sort(a) %timeit radix_sort(a) The mask_b loop can be at least partially vectorized, broadcasting out across masks from & , and using cumsum with axis arg, but that
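The asker's `radix_sort` is not included in this excerpt. For orientation only, here is one possible sketch of a least-significant-bit radix sort built from NumPy primitives, assuming non-negative integers. The name `radix_sort_bits` is hypothetical and the code makes no claim to match the original or to beat `np.sort`.

```python
import numpy as np

def radix_sort_bits(a):
    """LSD radix sort, one bit per pass; assumes non-negative integers."""
    a = np.asarray(a).copy()
    if a.size == 0:
        return a
    # Number of passes = bits needed to represent the largest value.
    n_bits = int(a.max()).bit_length()
    for bit in range(n_bits):
        # Mask of elements whose current bit is 1.
        ones = ((a >> bit) & 1).astype(bool)
        # Stable partition: zeros first, then ones, order kept inside each group.
        a = np.concatenate((a[~ones], a[ones]))
    return a

rng = np.random.default_rng(0)
data = rng.integers(0, 10**8, 10**5)
result = radix_sort_bits(data)
```

Each pass is vectorized, but the outer loop over bits stays in Python, and every pass allocates a new array.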

Tensorflow indicator matrix for top n values

Question: Does anyone know how to extract the top n largest values per row of a rank-2 tensor? For instance, if I wanted the top 2 values of a tensor of shape [2,4] with values: [[40, 30, 20, 10], [10, 20, 30, 40]] The desired condition matrix would look like: [[True, True, False, False],[False, False, True, True]] Once I have the condition matrix, I can use tf.select to choose the actual values. Thank you for your assistance! Answer 1: You can do it using the built-in tf.nn.top_k function: a = tf.convert_to_tensor([[40
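The answer's TensorFlow code is truncated here. As a stand-in, this is a NumPy analogue of the indicator-matrix idea rather than the answer's `tf.nn.top_k` solution: find the column indices of the n largest entries per row, then scatter `True` into a boolean mask. The helper name `top_n_mask` is made up for this sketch.

```python
import numpy as np

def top_n_mask(a, n):
    """Boolean mask marking the n largest entries in each row of a 2D array."""
    a = np.asarray(a)
    # Column indices of the n largest values per row (order within them is arbitrary).
    idx = np.argpartition(a, -n, axis=1)[:, -n:]
    mask = np.zeros(a.shape, dtype=bool)
    # Scatter True at those positions, row by row.
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

values = np.array([[40, 30, 20, 10],
                   [10, 20, 30, 40]])
mask = top_n_mask(values, 2)
```

With tied values the mask still marks exactly n entries per row, but which of the tied entries are picked is not specified.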

Does /arch:AVX enable AVX2?

Question: I can't find an answer to this simple question: does /arch:AVX enable AVX2, with its fancy 256-bit registers, in Visual Studio 2012 Update 4? Line of thought: Yes, it enables AVX because VS doesn't mention AVX2. But I think VS can do AVX2 because my intrinsics work. No, it doesn't, because SSE and SSE2 are separate Answer 1: It refers to AVX, not AVX2. According to Microsoft this applies (mostly) to floating point operations. VS2012 supports AVX2 intrinsic functions regardless of this flag. AVX

Is there a way to show where LLVM is auto vectorising?

Question: Context: I have several loops in an Objective-C library I am writing which process large text arrays. I can see that right now it is running in a single-threaded manner. I understand that LLVM is now capable of auto-vectorising loops, as described at Apple's session at WWDC. It is, however, very cautious in the way it does it, one reason being the possibility of variables being modified due to CPU pipelining. My question: how can I see where LLVM has vectorised my code, and, more

Analysis of a 3D point cloud by projection in a 2D surface

Question: I have a 3D point cloud (XYZ) where Z can be position or energy. I want to project the points onto a 2D surface divided into an n-by-m grid (in my problem n = m), so that each grid cell holds the maximum difference of Z when Z is position, or the sum over Z when Z is energy. For example, in a range of 0 <= (x,y) <= 20, there are 500 points. Let's say the xy-plane has n-by-m partitions, e.g. 4-by-4; by which I mean in both x and y directions we
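The question is truncated before any answer, so the following is only one possible reading of it, sketched in NumPy: assign each point to a grid cell, then reduce Z per cell, either as a sum (the energy case) or as max minus min (the position case). The function name `grid_stats`, the default bounds, and the sample points are assumptions for illustration.

```python
import numpy as np

def grid_stats(x, y, z, n, m, lo=0.0, hi=20.0):
    """Per-cell sum of z and per-cell (max - min) of z on an n-by-m grid."""
    # Cell index along each axis; points on the upper edge go to the last cell.
    ix = np.clip(((x - lo) / (hi - lo) * n).astype(int), 0, n - 1)
    iy = np.clip(((y - lo) / (hi - lo) * m).astype(int), 0, m - 1)
    flat = ix * m + iy  # one flat cell id per point

    # Energy case: sum of z in each cell.
    z_sum = np.bincount(flat, weights=z, minlength=n * m).reshape(n, m)

    # Position case: unbuffered per-cell max and min, then their difference.
    z_max = np.full(n * m, -np.inf)
    z_min = np.full(n * m, np.inf)
    np.maximum.at(z_max, flat, z)
    np.minimum.at(z_min, flat, z)
    # Empty cells are reported as NaN.
    z_range = np.where(np.isfinite(z_max), z_max - z_min, np.nan).reshape(n, m)
    return z_sum, z_range

# Made-up sample: two points share cell (0, 0), one falls in cell (3, 3).
x = np.array([1.0, 2.0, 19.0])
y = np.array([1.0, 2.0, 19.0])
z = np.array([5.0, 8.0, 3.0])
z_sum, z_range = grid_stats(x, y, z, 4, 4)
```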

SSE slower than FPU?

Question: I have a large piece of code, part of which contains this expression: result = (nx * m_Lx + ny * m_Ly + m_Lz) / sqrt(nx * nx + ny * ny + 1); which I have vectorized as follows (everything is already a float): __m128 r = _mm_mul_ps(_mm_set_ps(ny, nx, ny, nx), _mm_set_ps(ny, nx, m_Ly, m_Lx)); __declspec(align(16)) int asInt[4] = { _mm_extract_ps(r,0), _mm_extract_ps(r,1), _mm_extract_ps(r,2), _mm_extract_ps(r,3) }; float (&res)[4] = reinterpret_cast<float (&)[4]>(asInt); result = (res
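The excerpt ends before any answer, so this is not the fix proposed there. It only illustrates a different way to vectorize the same formula: instead of packing the terms of one evaluation into a SIMD register, evaluate the expression for many `(nx, ny)` pairs at once. The sketch uses NumPy and assumes, hypothetically, that the surrounding code has a batch of `nx` and `ny` values available; the function name `batched_result` and the sample inputs are made up.

```python
import numpy as np

def batched_result(nx, ny, m_Lx, m_Ly, m_Lz):
    """Evaluate (nx*m_Lx + ny*m_Ly + m_Lz) / sqrt(nx^2 + ny^2 + 1) element-wise."""
    nx = np.asarray(nx, dtype=np.float32)
    ny = np.asarray(ny, dtype=np.float32)
    # Same arithmetic as the scalar expression, applied to whole arrays.
    return (nx * m_Lx + ny * m_Ly + m_Lz) / np.sqrt(nx * nx + ny * ny + 1)

# Made-up inputs: two (nx, ny) pairs evaluated in one call.
out = batched_result([0.0, 3.0], [0.0, 4.0], m_Lx=1.0, m_Ly=2.0, m_Lz=3.0)
```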

Direction of two points

Question: I've forgotten some high school math, so I'm asking here. If I have two points p1(x1,y1) and p2(x2,y2), and the direction is p1 --> p2, that is, p1 points to p2, is the direction represented by Vector(x2-x1,y2-y1) or Vector(x1-x2,y1-y2)? By the way, what is the purpose of normalizing a vector? Answer 1: Answer 1: it is Vector(x2-x1,y2-y1). Answer 2: Normalizing means scaling the vector so that its length is 1. It is a useful operation in many computations, for example, normal vectors
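The two parts of the answer can be checked with a short NumPy sketch. The coordinates for `p1` and `p2` are made-up sample values.

```python
import numpy as np

# Hypothetical sample points.
p1 = np.array([1.0, 2.0])
p2 = np.array([4.0, 6.0])

# Direction from p1 to p2: end point minus start point.
direction = p2 - p1

# Normalizing divides by the length, leaving a vector of length 1
# that keeps only the direction (undefined if p1 == p2).
length = np.linalg.norm(direction)
unit = direction / length
```

Here `direction` is `(3, 4)`, its length is 5, and `unit` is `(0.6, 0.8)`.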