vectorization

How do I shift my thinking to 'vectorize my computation' more than using 'for-loops'?

天涯浪子 submitted on 2019-12-21 06:14:48
Question: This is definitely more of a notional question, but I wanted to get others' expert input on this topic at SO. Most of my programming lately involves NumPy arrays. I've been matching items in two or more arrays of different sizes. Most of the time I reach for a for-loop or, even worse, a nested for-loop. As I gain more experience in Data Science I'm trying to avoid for-loops, because they run slower. I am well aware of Numpy and the pre-defined cmds
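As an illustration of the kind of rewrite the question is asking about, here is a minimal sketch of matching items between two arrays of different sizes without an explicit loop. The arrays `a` and `b` are made-up example data, not taken from the original question; `np.isin` and broadcasting are only two of several possible approaches.

```python
import numpy as np

# Hypothetical example data: two arrays of different sizes.
a = np.array([3, 7, 1, 9, 4])
b = np.array([9, 3, 5])

# Loop-style thinking: test each element of a against b one at a time.
matches_loop = [x for x in a if x in b]

# Vectorized membership test: one boolean per element of a.
mask = np.isin(a, b)
matches_vec = a[mask]

# Broadcasting: compare every element of a with every element of b at once.
# pairs has shape (len(a), len(b)); pairs[i, j] is True where a[i] == b[j].
pairs = a[:, None] == b[None, :]
idx_a, idx_b = np.nonzero(pairs)
```

The broadcasting form builds a `len(a)` by `len(b)` boolean matrix, so it trades memory for speed; for very large inputs `np.isin` or a sort-based approach may be the safer choice.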

How to apply corr2 functions in Multidimensional arrays in MATLAB?

↘锁芯ラ submitted on 2019-12-21 04:59:12
Question: Let's say I have two matrices A and B: A = rand(4,5,3); B = rand(4,5,6). I want to apply the function 'corr2' to calculate the correlation coefficients for every pair of slices: corr2(A(:,:,1),B(:,:,1)) corr2(A(:,:,1),B(:,:,2)) corr2(A(:,:,1),B(:,:,3)) ... corr2(A(:,:,1),B(:,:,6)) ... corr2(A(:,:,2),B(:,:,1)) corr2(A(:,:,2),B(:,:,2)) ... corr2(A(:,:,3),B(:,:,6)) How can I avoid loops and vectorize this? Answer 1: Hacked into the m-file for corr2 to create a customized vectorized version for working with 3D
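The answer's MATLAB code is cut off above, so the following is not that code. It is a rough NumPy sketch of the same idea, assuming the usual definition of the 2-D correlation coefficient (mean-centered products divided by the product of the centered norms). The function name `pairwise_corr2` is hypothetical.

```python
import numpy as np

def pairwise_corr2(A, B):
    """Correlation coefficient between every slice A[:, :, i] and B[:, :, j].

    Returns an array of shape (A.shape[2], B.shape[2]).
    """
    # Flatten each 2-D slice into one column: shape (rows*cols, n_slices).
    a = A.reshape(-1, A.shape[2])
    b = B.reshape(-1, B.shape[2])
    # Subtract each slice's own mean.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Numerator: sum of products for every (i, j) pair via one matrix product.
    num = a.T @ b
    # Denominator: sqrt of (sum of squares of slice i) * (sum of squares of slice j).
    den = np.sqrt(np.outer((a ** 2).sum(axis=0), (b ** 2).sum(axis=0)))
    return num / den

rng = np.random.default_rng(0)
A = rng.random((4, 5, 3))
B = rng.random((4, 5, 6))
C = pairwise_corr2(A, B)  # C[i, j] pairs slice i of A with slice j of B
```

The single matrix product replaces the double loop over slice pairs, which is the core of the vectorization.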

vectorized radix sort with numpy - can it beat np.sort?

Deadly submitted on 2019-12-21 04:39:24
Question: NumPy doesn't yet have a radix sort, so I wondered whether it was possible to write one using pre-existing NumPy functions. So far I have the following, which does work, but is about 10 times slower than NumPy's quicksort. Test and benchmark: a = np.random.randint(0, 10**8, 10**6) assert(np.all(radix_sort(a) == np.sort(a))) %timeit np.sort(a) %timeit radix_sort(a) The mask_b loop can be at least partially vectorized, broadcasting out across masks from & , and using cumsum with axis arg, but that
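The asker's `radix_sort` is not shown in this excerpt, so the following is a separate, simplified sketch of a least-significant-digit radix sort built from NumPy calls. It assumes non-negative integers and uses a stable `np.argsort` on each 8-bit digit as the per-pass bucket step, which is a shortcut rather than a hand-written counting sort. No claim is made about how it compares to `np.sort` in speed.

```python
import numpy as np

def radix_sort(a, bits_per_pass=8):
    """LSD radix sort sketch for non-negative integer arrays."""
    out = np.array(a, copy=True)
    if out.size == 0:
        return out
    mask = (1 << bits_per_pass) - 1
    max_val = int(out.max())
    shift = 0
    # One pass per digit, from least to most significant.
    while (max_val >> shift) > 0:
        digits = (out >> shift) & mask
        # A stable sort on the current digit preserves the order
        # established by the lower digits in earlier passes.
        out = out[np.argsort(digits, kind="stable")]
        shift += bits_per_pass
    return out

rng = np.random.default_rng(0)
a = rng.integers(0, 10**8, size=10**5)
sorted_a = radix_sort(a)
```

Correctness depends on each pass being stable; swapping in an unstable sort kind would break the algorithm.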

Tensorflow indicator matrix for top n values

时间秒杀一切 submitted on 2019-12-21 04:35:12
Question: Does anyone know how to extract the top n largest values per row of a rank 2 tensor? For instance, if I wanted the top 2 values of a tensor of shape [2,4] with values: [[40, 30, 20, 10], [10, 20, 30, 40]] The desired condition matrix would look like: [[True, True, False, False],[False, False, True, True]] Once I have the condition matrix, I can use tf.select to choose the actual values. Thank you for your assistance! Answer 1: You can do it using the built-in tf.nn.top_k function: a = tf.convert_to_tensor([[40
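The TensorFlow answer is truncated above, so this is not a reconstruction of it. As a hedged illustration of the same indicator-matrix idea, here is a NumPy analogue; `top_n_mask` is a hypothetical helper name, and when values tie, which of the tied entries gets marked is unspecified.

```python
import numpy as np

def top_n_mask(x, n):
    """Boolean matrix marking the n largest entries in each row of x."""
    x = np.asarray(x)
    # Column indices of the n largest values per row (in no particular order).
    idx = np.argpartition(x, -n, axis=1)[:, -n:]
    mask = np.zeros(x.shape, dtype=bool)
    # Set mask[row, idx[row, k]] = True for every row and k.
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

x = np.array([[40, 30, 20, 10],
              [10, 20, 30, 40]])
mask = top_n_mask(x, 2)
```

For the example values from the question, `mask` matches the desired condition matrix, and `np.where(mask, x, 0)` would then pick out the actual values.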

Does /arch:AVX enable AVX2?

喜夏-厌秋 submitted on 2019-12-21 04:29:26
Question: I can't find an answer to this simple question: does /arch:AVX enable AVX2, with its fancy 256-bit registers, on Visual Studio 2012 Update 4? Line of thought: Yes, it enables AVX2, because VS doesn't mention AVX2 separately. But I think VS can do AVX2 because my intrinsics work. No, it doesn't, because SSE and SSE2 are separate. Answer 1: It refers to AVX, not AVX2. According to Microsoft this applies (mostly) to floating point operations. VS2012 supports AVX2 intrinsic functions regardless of this flag. AVX

Is there a way to show where LLVM is auto vectorising?

主宰稳场 submitted on 2019-12-21 04:22:08
Question: Context: I have several loops in an Objective-C library I am writing which process large text arrays. I can see that right now it runs in a single-threaded manner. I understand that LLVM is now capable of auto-vectorising loops, as described in Apple's session at WWDC. It is, however, very cautious in the way it does so, one reason being the possibility of variables being modified due to CPU pipelining. My question: how can I see where LLVM has vectorised my code, and, more

Analysis of a 3D point cloud by projection in a 2D surface

柔情痞子 submitted on 2019-12-20 23:40:11
Question: I have a 3D point cloud (XYZ) where Z can be position or energy. I want to project the points onto a 2D surface divided into an n-by-m grid (in my problem n = m) so that each grid cell holds either the maximum difference of Z, when Z is position, or the sum over Z, when Z is energy. For example, in a range of 0 <= (x,y) <= 20, there are 500 points. Let's say the xy-plane has n-by-m partitions, e.g. 4-by-4; by which I mean in both x and y directions we
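The question is cut off before any solution appears, so the following is only a sketch of one way the described projection could be vectorized in NumPy. It assumes a square domain from `lo` to `hi` on both axes, and takes "maximum difference of Z" to mean max minus min within a cell; the function name `grid_stats` and its parameters are hypothetical.

```python
import numpy as np

def grid_stats(x, y, z, n, m, lo=0.0, hi=20.0):
    """Per-cell sum of z and per-cell (max - min) of z on an n-by-m grid."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    # Cell index along each axis; clip so points on the upper edge
    # land in the last cell instead of falling outside the grid.
    ix = np.clip(((x - lo) / (hi - lo) * n).astype(int), 0, n - 1)
    iy = np.clip(((y - lo) / (hi - lo) * m).astype(int), 0, m - 1)
    flat = ix * m + iy  # single integer label per point

    # Energy case: sum of z in every cell.
    z_sum = np.bincount(flat, weights=z, minlength=n * m).reshape(n, m)

    # Position case: max - min of z in every cell, NaN where a cell is empty.
    counts = np.bincount(flat, minlength=n * m)
    z_max = np.full(n * m, -np.inf)
    z_min = np.full(n * m, np.inf)
    np.maximum.at(z_max, flat, z)
    np.minimum.at(z_min, flat, z)
    z_range = np.where(counts > 0, z_max - z_min, np.nan).reshape(n, m)
    return z_sum, z_range

# Tiny made-up example: four points on a 2-by-2 grid over [0, 20] x [0, 20].
z_sum, z_range = grid_stats([1, 2, 11, 19], [1, 3, 12, 1], [5, 8, 2, 7], 2, 2)
```

`np.maximum.at` and `np.minimum.at` apply the reduction per label without a Python loop over cells; they can be slow for very large inputs, where sorting by `flat` and using `np.maximum.reduceat` is a common alternative.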

SSE slower than FPU?

China☆狼群 submitted on 2019-12-20 20:18:32
Question: I have a large piece of code, part of whose body contains this expression: result = (nx * m_Lx + ny * m_Ly + m_Lz) / sqrt(nx * nx + ny * ny + 1); which I have vectorized as follows (everything is already a float): __m128 r = _mm_mul_ps(_mm_set_ps(ny, nx, ny, nx), _mm_set_ps(ny, nx, m_Ly, m_Lx)); __declspec(align(16)) int asInt[4] = { _mm_extract_ps(r,0), _mm_extract_ps(r,1), _mm_extract_ps(r,2), _mm_extract_ps(r,3) }; float (&res)[4] = reinterpret_cast<float (&)[4]>(asInt); result = (res

Direction of two points

半腔热情 submitted on 2019-12-20 16:25:02
Question: I have forgotten some high school math, so I'm asking here. If I have two points p1(x1,y1) and p2(x2,y2), and the direction is p1 --> p2, that is, p1 points to p2, is the vector representing this direction Vector(x2-x1,y2-y1) or Vector(x1-x2, y1-y2)? By the way, what is the purpose of normalizing a vector? Answer 1: Answer 1: it is Vector(x2-x1,y2-y1) Answer 2: Normalizing means scaling the vector so that its length is 1. It is a useful operation in many computations, for example, normal vectors
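A small worked example of both parts of the answer, using made-up points (they are not from the original post): the direction from p1 to p2 is p2 minus p1, and normalizing divides that vector by its length.

```python
import numpy as np

# Hypothetical example points.
p1 = np.array([1.0, 2.0])
p2 = np.array([4.0, 6.0])

# Direction from p1 to p2: subtract the start point from the end point.
direction = p2 - p1                  # (x2 - x1, y2 - y1)

# Normalizing keeps the direction but scales the length to 1.
length = np.linalg.norm(direction)
unit = direction / length
```

Here `direction` is (3, 4) with length 5, so `unit` is (0.6, 0.8). A zero-length vector (p1 equal to p2) has no direction and cannot be normalized this way.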