vectorization | 易学教程

Vectorization to calculate many distances

Question: I am new to NumPy/pandas and vectorized computation. I am working on a data task with two datasets. Dataset 1 contains a list of places with their longitude and latitude and a variable A. Dataset 2 also contains a list of places with their longitude and latitude. For each place in dataset 1, I would like to calculate its distance to every place in dataset 2, but I only need a count of the places in dataset 2 whose distance is less than the value of variable A. Note also both of the ...
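
The counting task described in this question can be sketched with NumPy broadcasting. This is only an illustration under stated assumptions: distances are great-circle (haversine) distances in kilometres, variable A is a per-place radius in the same unit, and the function names `haversine_matrix` and `count_within` are hypothetical, not taken from the original post.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_matrix(lat1, lon1, lat2, lon2):
    """Pairwise great-circle distances in km, shape (len(lat1), len(lat2))."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(a, dtype=float)) for a in (lat1, lon1, lat2, lon2)
    )
    # Broadcasting a column against a row yields every pair at once.
    dlat = lat1[:, None] - lat2[None, :]
    dlon = lon1[:, None] - lon2[None, :]
    h = (
        np.sin(dlat / 2) ** 2
        + np.cos(lat1)[:, None] * np.cos(lat2)[None, :] * np.sin(dlon / 2) ** 2
    )
    # Clip guards against tiny floating-point overshoot outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))


def count_within(lat1, lon1, radius_km, lat2, lon2):
    """For each place in dataset 1, count dataset-2 places closer than its radius."""
    dist = haversine_matrix(lat1, lon1, lat2, lon2)
    radius = np.asarray(radius_km, dtype=float)[:, None]
    return (dist < radius).sum(axis=1)
```

The full distance matrix has one entry per pair of places, so for large datasets it may be necessary to process dataset 1 in chunks to keep memory use bounded.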

Make 2D Numpy array from coordinates

Question: I have data points that represent coordinates for a 2D array (matrix). The points are regularly gridded, except that data points are missing from some grid positions. For example, consider some XYZ data that fits on a regular 0.1 grid with shape (3, 4). There are gaps and missing points, so there are 5 points rather than 12:

    import numpy as np
    X = np.array([0.4, 0.5, 0.4, 0.4, 0.7])
    Y = np.array([1.0, 1.0, 1.1, 1.2, 1.2])
    Z = np.array([3.3, 2.5, 3.6, 3.8, 1.8])
    # Evaluate the regular grid ...
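
One possible way to place these points on a (3, 4) array is to convert each coordinate to an integer grid index and fill the gaps with NaN. This is a sketch, not the approach from the original post: the orientation (rows follow increasing Y, columns follow increasing X), the NaN fill value, and the names `STEP`, `row`, `col` and `grid` are all assumptions.

```python
import numpy as np

X = np.array([0.4, 0.5, 0.4, 0.4, 0.7])
Y = np.array([1.0, 1.0, 1.1, 1.2, 1.2])
Z = np.array([3.3, 2.5, 3.6, 3.8, 1.8])

STEP = 0.1  # grid spacing stated in the question

# Convert coordinates to integer indices. Rounding absorbs floating-point
# error such as (0.7 - 0.4) / 0.1 evaluating to slightly less than 3.
col = np.rint((X - X.min()) / STEP).astype(int)
row = np.rint((Y - Y.min()) / STEP).astype(int)

# Start from an all-NaN grid so that missing positions stay marked.
grid = np.full((row.max() + 1, col.max() + 1), np.nan)
grid[row, col] = Z
```

If the grid extent is known in advance, the shape can be passed explicitly instead of being derived from the largest index present in the data.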

Fastest way to extract dictionary of sums in numpy in 1 I/O pass

Question: Let's say I have an array like:

    arr = np.array([[1,20,5], [1,20,8], [3,10,4], [2,30,6], [3,10,5]])

and I would like to form a dictionary of the sum of the third column over the rows that share each value in the first column, i.e. return {1: 13, 2: 6, 3: 9}. To make matters more challenging, there are 1 billion rows in my array and 100k unique elements in the first column. Approach 1: Naively, I can invoke np.unique() and then iterate through each item in the unique array with a combination of np ...
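
A common way to avoid the per-key Python loop is to combine `np.unique(..., return_inverse=True)` with `np.bincount(..., weights=...)`. The sketch below illustrates that idea on the small array from the question. It is not claimed to be the fastest option at the billion-row scale mentioned, since `np.unique` sorts its input, and the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 20, 5], [1, 20, 8], [3, 10, 4], [2, 30, 6], [3, 10, 5]])

# keys: sorted unique values of the first column.
# inverse: for every row, the position of its key within `keys`.
keys, inverse = np.unique(arr[:, 0], return_inverse=True)

# bincount adds up the weights (third column) that fall into each key slot.
# With weights it returns float64, which is exact for integer sums below 2**53.
sums = np.bincount(inverse.ravel(), weights=arr[:, 2])

result = {int(k): int(s) for k, s in zip(keys, sums)}
```

When the keys are already small non-negative integers, `np.bincount(arr[:, 0], weights=arr[:, 2])` can be applied directly and the `np.unique` step skipped, at the cost of zero-filled slots for keys that do not occur.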

Strange uint32_t to float array conversion

Question: I have the following code snippet:

    #include <cstdio>
    #include <cstdint>

    static const size_t ARR_SIZE = 129;

    int main() {
        uint32_t value = 2570980487;
        uint32_t arr[ARR_SIZE];
        for (int x = 0; x < ARR_SIZE; ++x)
            arr[x] = value;
        float arr_dst[ARR_SIZE];
        for (int x = 0; x < ARR_SIZE; ++x) {
            arr_dst[x] = static_cast<float>(arr[x]);
        }
        printf("%s\n", arr_dst[ARR_SIZE - 1] == arr_dst[ARR_SIZE - 2] ? "OK" : "WTF??!!");
        printf("magic = %0.10f\n", arr_dst[ARR_SIZE - 2]);
        printf("magic = %0.10f\n", arr ...

SIMD/SSE: How to check that all vector elements are non-zero

Question: I need to check that all vector elements are non-zero. So far I have found the following solution. Is there a better way to do this? I am using gcc 4.8.2 on Linux/x86_64, with instructions up to SSE4.2.

    typedef char ChrVect __attribute__((vector_size(16), aligned(16)));

    inline bool testNonzero(ChrVect vect) {
        const ChrVect vzero = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
        return (0 == (__int128_t)(vzero == vect));
    }

Update: the code above compiles to the following assembly code (when compiled as a non-inline function): ...

Vectorized C# code with SIMD using Vector<T> running slower than classic loop

Question: I've seen a few articles describing how Vector<T> is SIMD-enabled and is implemented using JIT intrinsics, so the compiler will correctly output AVX/SSE/... instructions when using it, allowing much faster code than classic, linear loops (example here). I decided to rewrite one of my methods to see whether I could get some speedup, but so far I have failed: the vectorized code runs 3 times slower than the original, and I'm not sure why. Here are two versions of a method ...

Vector (array) addition in Postgres

Question: I have a column of numeric[] values that all have the same size. I'd like to take their element-wise average. By this I mean that the average of {1, 2, 3}, {-1, -2, -3}, and {3, 3, 3} should be {1, 1, 1}. Also of interest is how to sum these element-wise, although I expect that any solution for one will be a solution for the other. (NB: The length of the arrays is fixed within a single table, but may vary between tables, so I need a solution that doesn't assume a certain length.) My ...

ARM Neon: How to convert from uint8x16_t to uint8x8x2_t?

Question: I recently discovered the vreinterpret{q}_dsttype_srctype casting operator. However, it doesn't seem to support conversion to the data type described at this link (bottom of the page):

    "Some intrinsics use an array of vector types of the form: <type><size>x<number of lanes>x<length of array>_t. These types are treated as ordinary C structures containing a single element named val. An example structure definition is:

    struct int16x4x2_t { int16x4_t val[2]; };"

Do you know how to convert ...

Elegant vectorized version of CHANGEM (substitute values) - MATLAB

Question: In MATLAB 2012b, there is a changem function that allows you to substitute elements of a matrix with other values specified by a set of keys: Substitute values in data array. Is there an elegant/vectorized way to do the same if I don't have the Mapping Toolbox?

Answer 1: Yes, use ismember:

    A = magic(3);
    oldCode = [ 8 9];
    newCode = [12 13];
    [a,b] = ismember(A,oldCode);
    A(a) = newCode(b(a));

I don't know changem, and I suspect the above will not fully cover its functionality (why else would TMW ...
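
For readers following the NumPy questions on this page, the same lookup-and-replace idea can be sketched in Python. This is an analogue of the ismember approach written for illustration, not a translation endorsed by the original answer: `np.searchsorted` stands in for the second output of ismember, and the matrix is typed in by hand with the values MATLAB returns for magic(3).

```python
import numpy as np

A = np.array([[8, 1, 6], [3, 5, 7], [4, 9, 2]])  # same values as MATLAB's magic(3)
old_code = np.array([8, 9])
new_code = np.array([12, 13])

# Locate each element of A within old_code (sorted through `order`).
order = np.argsort(old_code)
pos = np.searchsorted(old_code, A, sorter=order)
pos = np.clip(pos, 0, len(old_code) - 1)  # keep indices valid for values above the largest key
idx = order[pos]

# Only positions whose candidate key really equals the element are replaced.
mask = old_code[idx] == A
result = np.where(mask, new_code[idx], A)
```

Elements of `A` that do not appear in `old_code` pass through unchanged, which matches the behaviour of the MATLAB snippet above.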

How can I disable vectorization while using GCC?

Question: I am compiling my code using the following command:

    gcc -O3 -ftree-vectorizer-verbose=6 -msse4.1 -ffast-math

With this, all the optimizations are enabled. But I want to disable vectorization while keeping the other optimizations.

Answer 1: Most GCC switches can be used with a no- prefix to disable their behavior. Try -fno-tree-vectorize (placed after -O3 on the command line).

Answer 2: You can also selectively enable and disable vectorization with the optimize function attributes or pragmas: http://gcc ...