dot-product

AVX2: Computing dot product of 512 float arrays

佐手、 posted on 2020-07-14 17:42:48
Question: I will preface this by saying that I am a complete beginner at SIMD intrinsics. Essentially, I have a CPU that supports AVX2 intrinsics (Intel(R) Core(TM) i5-7500T CPU @ 2.70GHz). I would like to know the fastest way to compute the dot product of two std::vector<float> of size 512. I have done some digging online and found this and this, and this Stack Overflow question suggests using the following function: __m256 _mm256_dp_ps(__m256 m1, __m256 m2, const int mask); However, these

Speeding up numpy.dot

筅森魡賤 posted on 2020-01-22 07:31:31
Question: I've got a numpy script that spends about 50% of its runtime in the following code:
    s = numpy.dot(v1, v1)
where v1 = v[1:] and v is a 4000-element 1D ndarray of float64 stored in contiguous memory (v.strides is (8,)). Any suggestions for speeding this up?
Edit: This is on Intel hardware. Here is the output of my numpy.show_config():
    atlas_threads_info:
        libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
        library_dirs = ['/usr/local/atlas-3.9.16/lib']
        language = f77
        include_dirs = ['/usr

Numpy: Dot product with max instead of sum

北城余情 posted on 2020-01-03 20:23:10
Question: Is there a way in numpy to do the following (or is there a general mathematical term for this)? Assume a normal dot product:
    M3[i,k] = sum_j(M1[i,j] * M2[j,k])
Now I would like to replace the sum by some other operation, say the maximum:
    M3[i,k] = max_j(M1[i,j] * M2[j,k])
As you can see, it is completely parallel to the above, except that we take the max over all j instead of the sum. Other options could be min, prod, or any other operation that turns a sequence/set into a value.
Answer 1: Normal dot
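
One way to get the max-instead-of-sum product described in the question is to broadcast the element-wise products into a 3D array and reduce over the shared index j. This is a minimal sketch, not necessarily the approach the truncated answer takes; the function name generalized_dot and the sample matrices are made up for illustration, and the intermediate array has shape (i, j, k), so memory use grows quickly for large inputs.

```python
import numpy as np

def generalized_dot(M1, M2, reduce=np.max):
    """Return M3 with M3[i, k] = reduce_j(M1[i, j] * M2[j, k]).

    `reduce` is any NumPy reduction accepting an `axis` argument
    (np.max, np.min, np.prod, np.sum, ...). Hypothetical helper.
    """
    # Shape (i, j, 1) * (1, j, k) broadcasts to (i, j, k).
    products = M1[:, :, None] * M2[None, :, :]
    # Collapse the shared axis j.
    return reduce(products, axis=1)

# Illustrative data (not from the question).
M1 = np.array([[1.0, 2.0], [3.0, 4.0]])
M2 = np.array([[5.0, 6.0], [7.0, 8.0]])

M3_max = generalized_dot(M1, M2, reduce=np.max)
M3_sum = generalized_dot(M1, M2, reduce=np.sum)  # same as the normal dot product
```

With reduce=np.sum the helper reproduces M1 @ M2, which is a quick sanity check that the broadcasting is set up correctly.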

What is the pythonic way to calculate dot product?

你说的曾经没有我的故事 posted on 2019-12-28 09:06:33
Question: I have two lists, one named A and the other named B. Each element in A is a triple, and each element in B is just a number. I would like to calculate the result defined as:
    result = A[0][0] * B[0] + A[1][0] * B[1] + ... + A[n-1][0] * B[n-1]
I know the logic is easy, but how do I write it in a Pythonic way? Thanks!
Answer 1:
    import numpy
    result = numpy.dot(numpy.array(A)[:,0], B)
http://docs.scipy.org/doc/numpy/reference/
If you want to do it without numpy, try sum( [a[i][0]*b[i] for i in range
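
For the no-numpy case, one common idiom is to pair the elements with zip and feed a generator expression to sum. This is a sketch with made-up sample data; it is offered as an alternative and is not a completion of the truncated answer above.

```python
# Illustrative data: each element of A is a triple, each element of B a number.
A = [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
B = [4, 5, 6]

# Pair each triple with its weight and multiply the triple's first entry.
result = sum(a[0] * b for a, b in zip(A, B))
```

Note that zip stops at the shorter list, so a length mismatch between A and B is silently ignored here.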

What is the recommended way to compute a weighted sum of selected columns of a pandas dataframe?

拟墨画扇 posted on 2019-12-23 19:31:07
Question: For example, I would like to compute the weighted sum of columns 'a' and 'c' for the DataFrame below, with weights defined in the dictionary w.
    df = pd.DataFrame({'a': [1,2,3], 'b': [10,20,30], 'c': [100,200,300], 'd': [1000,2000,3000]})
    w = {'a': 1000., 'c': 10.}
I figured out some options myself (see below), but they all look a bit complicated. Isn't there a direct pandas operation for this basic use case? Something like df.wsum(w)? I tried pd.DataFrame.dot, but it raises a value error: df.dot(pd
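
One possible approach, sketched here under the assumption that every key of w is a column of df: select only the weighted columns first, so that the column labels line up with the index of the weight Series, and then call DataFrame.dot. The names weights and wsum are illustrative; df.wsum from the question does not exist in pandas.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30],
                   'c': [100, 200, 300], 'd': [1000, 2000, 3000]})
w = {'a': 1000., 'c': 10.}

# Turn the weights into a Series indexed by column name.
weights = pd.Series(w)

# Restrict df to the weighted columns so the labels align, then take the dot product.
wsum = df[list(w)].dot(weights)
```

An equivalent spelling is df[list(w)].mul(weights).sum(axis=1), which makes the multiply-then-sum structure explicit.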

3D space: following the direction that an object is pointing towards, using the mouse pointer

纵饮孤独 posted on 2019-12-22 08:41:52
Question: Given the 3D vector of the direction that the camera is facing and the orientation/direction vector of a 3D object in 3D space, how can I calculate the 2-dimensional slope that the mouse pointer must follow on the screen in order to visually move along the direction of said object? Basically I'd like to be able to click on an arrow and make it move back and forth by dragging it, but only if the mouse pointer drags (roughly) along the length of the arrow, i.e. in the direction that it
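
A rough sketch of one way to approach this, under several assumptions that are not in the question: an orthographic-style projection (perspective distortion is ignored), a right-handed coordinate system, a known camera "up" hint, and screen coordinates with y pointing up. The idea is to build the camera's right and up axes, take dot products of the object direction with them to get a 2D on-screen direction, and then project the mouse movement onto that direction. All function and variable names are hypothetical.

```python
import numpy as np

def screen_direction(camera_forward, object_dir, up_hint=(0.0, 1.0, 0.0)):
    """Project a 3D object direction onto the camera's image plane.

    Returns a unit 2D vector (x along camera-right, y along camera-up),
    or a zero vector if the object points straight along the view axis.
    """
    f = np.asarray(camera_forward, dtype=float)
    f = f / np.linalg.norm(f)
    # Camera basis: right and up are perpendicular to the viewing direction.
    right = np.cross(f, np.asarray(up_hint, dtype=float))
    right = right / np.linalg.norm(right)
    up = np.cross(right, f)
    # The dot products give the on-screen components of the object direction.
    d = np.asarray(object_dir, dtype=float)
    d2 = np.array([np.dot(d, right), np.dot(d, up)])
    norm = np.linalg.norm(d2)
    return d2 / norm if norm > 1e-12 else np.zeros(2)

def drag_along(mouse_delta, direction_2d):
    """Signed amount of a mouse movement that lies along the on-screen direction."""
    return float(np.dot(np.asarray(mouse_delta, dtype=float), direction_2d))

# Illustrative values: camera looks down -z, arrow points diagonally in the x-y plane.
arrow_2d = screen_direction((0.0, 0.0, -1.0), (1.0, 1.0, 0.0))
amount = drag_along((10.0, 0.0), arrow_2d)
```

The on-screen slope is arrow_2d[1] / arrow_2d[0] when the x component is nonzero. Many windowing systems have y pointing down, in which case the sign of the second component needs to be flipped.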

Get dot-product of dataframe with vector, and return dataframe, in Pandas

◇◆丶佛笑我妖孽 posted on 2019-12-21 09:29:06
Question: I am unable to find the entry for the method dot() in the official documentation. However, the method is there and I can use it. Why is this? On this topic, is there a way to compute an element-wise multiplication of every row in a DataFrame with another vector (and obtain a DataFrame back)? That is, similar to dot(), but computing the element-wise product rather than the dot product.
Answer 1: Here is an example of how to multiply a DataFrame by a vector: In [60]: df = pd.DataFrame({'A':
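
As a small illustration of the element-wise case, the sketch below uses DataFrame.mul with a Series whose index matches the column labels. The data and the names vec and scaled are made up and are not a reconstruction of the truncated answer.

```python
import pandas as pd

# Illustrative data (not taken from the truncated answer).
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
vec = pd.Series([10, 100], index=['A', 'B'])

# Multiply every row element-wise by vec; labels are aligned on the columns.
scaled = df.mul(vec, axis='columns')

# Summing the element-wise products across each row recovers the dot product.
row_dot = scaled.sum(axis=1)
```

The result scaled is still a DataFrame with the same shape as df, while df.dot(vec) collapses each row to a single number.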

How is convolution done with RGB channel?

感情迁移 posted on 2019-12-21 03:35:28
Question: Say we have a single-channel image (5x5)
    A = [ 1 2 3 4 5
          6 7 8 9 2
          1 4 5 6 3
          4 5 6 7 4
          3 4 5 6 2 ]
and a filter K (2x2)
    K = [ 1 1
          1 1 ]
An example of applying convolution (let us take the first 2x2 block from A) would be
    1*1 + 2*1 + 6*1 + 7*1 = 16
This is very straightforward. But let us introduce a depth factor to matrix A, i.e., an RGB image with 3 channels, or even conv layers in a deep network (with depth = 512, maybe). How would the convolution operation be done with the same filter? A similar
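
The sketch below illustrates the convention used by standard convolutional layers: the filter is given the same depth as the input, and each output value is the sum of element-wise products over height, width, and all channels. It reuses A and K from the question; the 3-channel image made by stacking A three times and the helper conv2d_valid are assumptions for illustration. As in most deep-learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation.

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 2],
              [1, 4, 5, 6, 3],
              [4, 5, 6, 7, 4],
              [3, 4, 5, 6, 2]])
K = np.ones((2, 2), dtype=int)

def conv2d_valid(image, kernel):
    """Slide `kernel` (kh, kw, C) over `image` (H, W, C); no padding, stride 1."""
    kh, kw, _ = kernel.shape
    H, W, _ = image.shape
    out = np.empty((H - kh + 1, W - kw + 1), dtype=image.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sum of products over the window and over every channel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw, :] * kernel)
    return out

# Single channel: add a trailing channel axis of length 1.
single = conv2d_valid(A[:, :, None], K[:, :, None])

# Hypothetical 3-channel image and a filter of matching depth.
rgb = np.stack([A, A, A], axis=-1)
K3 = np.stack([K, K, K], axis=-1)
multi = conv2d_valid(rgb, K3)
```

Here single[0, 0] is the 16 worked out in the question, and multi[0, 0] is three times that because all three channels are identical in this toy image. One filter still produces one 2D output map regardless of the input depth.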

NumPy: Dot product for many small matrices at once

青春壹個敷衍的年華 posted on 2019-12-20 04:28:12
Question: I have a long array of 3-by-3 matrices, e.g.,
    import numpy as np
    A = np.random.rand(25, 3, 3)
and for each of the small matrices, I would like to perform an outer product dot(a, a.T). The list comprehension
    import numpy as np
    B = np.array([ np.dot(a, a.T) for a in A ])
works, but doesn't perform well. A possible improvement could be to do just one big dot product, but I'm having trouble setting up A correctly for it. Any hints?
Answer 1: You can obtain the list of transposed matrices as A
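
Two vectorized ways to compute all of the products a @ a.T in one call are sketched below, compared against the list comprehension from the question. This is offered as an illustration and may differ from what the truncated answer goes on to suggest; the seeded generator and the names B_loop, B_einsum, and B_matmul are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the example reproducible
A = rng.random((25, 3, 3))

# Baseline from the question: one small dot product per matrix.
B_loop = np.array([np.dot(a, a.T) for a in A])

# einsum: for each n, B[n, i, k] = sum_j A[n, i, j] * A[n, k, j].
B_einsum = np.einsum('nij,nkj->nik', A, A)

# Batched matmul: swap the last two axes to transpose every matrix at once.
B_matmul = A @ A.transpose(0, 2, 1)
```

Both vectorized forms treat the leading axis as a batch dimension, so no Python-level loop over the 25 matrices is needed.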