matrix-multiplication

Efficient implementation of a sequence of matrix-vector products / specific “tensor”-matrix product

只愿长相守 submitted on 2019-12-07 07:02:02

Question: I have a special algorithm where, as one of the last steps, I need to carry out a multiplication of a 3-D array with a 2-D array such that each matrix slice of the 3-D array is multiplied with the corresponding column of the 2-D array. In other words, if, say, A is an N x N x N array and B is an N x N matrix, I need to compute a matrix C of size N x N where C(:,i) = A(:,:,i)*B(:,i). The naive way to implement this is a loop:

    C = zeros(N,N);
    for i = 1:N
        C(:,i) = A(:,:,i)*B(:,i);
    end

However, loops …
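For reference, a minimal NumPy sketch of the same computation without the loop, using einsum (array names follow the question; the third index of A is assumed to select the slice):

    import numpy as np

    N = 4
    A = np.random.rand(N, N, N)   # A[:, :, i] is the i-th N x N slice
    B = np.random.rand(N, N)

    # contract the shared index n: C[m, i] = sum_n A[m, n, i] * B[n, i]
    C = np.einsum('mni,ni->mi', A, B)

    # sanity check against the looped version
    C_loop = np.stack([A[:, :, i] @ B[:, i] for i in range(N)], axis=1)
    assert np.allclose(C, C_loop)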

Numpy Vectorization of sliding-window operation

给你一囗甜甜゛ submitted on 2019-12-07 05:07:45

Question: I have the following numpy arrays:

    arr_1 = [[1,2],[3,4],[5,6]]                                   # 3 x 2
    arr_2 = [[0.5,0.6],[0.7,0.8],[0.9,1.0],[1.1,1.2],[1.3,1.4]]   # 5 x 2

arr_1 is clearly a 3 x 2 array, whereas arr_2 is a 5 x 2 array. Now, without looping, I want to element-wise multiply arr_1 and arr_2 so that a sliding-window technique (window size 3) is applied to arr_2. Example:

    Multiplication 1: np.multiply(arr_1, arr_2[:3,:])
    Multiplication 2: np.multiply(arr_1, arr_2[1:4,:])
    Multiplication 3: np.multiply(arr_1, arr_2[2:5,:]) …
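A minimal sketch of one loop-free approach, using numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20+):

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    arr_1 = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)    # 3 x 2
    arr_2 = np.array([[0.5, 0.6], [0.7, 0.8], [0.9, 1.0],
                      [1.1, 1.2], [1.3, 1.4]])                 # 5 x 2

    # windows over axis 0: shape (3, 2, 3) -> rearrange to (3 windows, 3, 2)
    windows = sliding_window_view(arr_2, 3, axis=0).transpose(0, 2, 1)

    # broadcast arr_1 against every window: result shape (3, 3, 2)
    result = arr_1[None, :, :] * windows

    assert np.allclose(result[1], arr_1 * arr_2[1:4, :])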

Matrix-vector-multiplication in AVX not proportionately faster than in SSE

ぐ巨炮叔叔 submitted on 2019-12-07 02:47:07

Question: I was writing a matrix-vector multiplication in both SSE and AVX, using the following:

    for(size_t i=0;i<M;i++) {
        size_t index = i*N;
        __m128 a, x, r1;
        __m128 sum = _mm_setzero_ps();
        for(size_t j=0;j<N;j+=4,index+=4) {
            a = _mm_load_ps(&A[index]);
            x = _mm_load_ps(&X[j]);
            r1 = _mm_mul_ps(a,x);
            sum = _mm_add_ps(r1,sum);
        }
        sum = _mm_hadd_ps(sum,sum);
        sum = _mm_hadd_ps(sum,sum);
        _mm_store_ss(&C[i],sum);
    }

I used a similar method for AVX; however, at the end, since AVX doesn't have an equivalent …
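As a side note, a minimal NumPy reference for checking such kernels (names M, N, A, X, C follow the question; A is a row-major flat array, matching the index = i*N addressing):

    import numpy as np

    M, N = 8, 16
    A = np.random.rand(M * N).astype(np.float32)   # row-major flat matrix
    X = np.random.rand(N).astype(np.float32)

    # each C[i] is the dot product of row i of A with X, i.e. the value the
    # intrinsics accumulate four (SSE) or eight (AVX) lanes at a time
    C = A.reshape(M, N) @ X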

Which is the best way to multiply a large and sparse matrix with its transpose?

别来无恙 submitted on 2019-12-07 02:23:25

Question: I currently want to multiply a large sparse matrix (~1M x 200k) with its transpose. The values of the resulting matrix would be in float. I tried loading the matrix as a scipy sparse matrix and multiplying each row of the first matrix with the second matrix. The multiplication took ~2 hrs to complete. What is the efficient way to achieve this multiplication? I see a pattern in the computation: the matrix is large and sparse, and it is a multiplication of a matrix with its transpose. So, the …
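A minimal sketch of the direct approach with scipy's sparse types, letting the library's native code perform the whole product at once instead of a Python loop over rows (the shapes here are small stand-ins for the ~1M x 200k matrix in the question):

    import numpy as np
    from scipy import sparse

    # random sparse matrix in CSR format, which supports fast row-oriented products
    A = sparse.random(1000, 200, density=0.01, format='csr')

    G = A @ A.T   # stays sparse; shape (1000, 1000), symmetric by construction

    # sanity check against the dense computation
    assert np.allclose(G.toarray(), A.toarray() @ A.toarray().T)

Since G equals its own transpose, only half of its entries are mathematically independent, which is the pattern the question alludes to.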

Scipy LinearOperator With Multiple Inputs

[亡魂溺海] submitted on 2019-12-06 22:24:51

Question: I need to invert a large, dense matrix, which I hoped to do using Scipy's gmres. Fortunately, the dense matrix A follows a pattern and I do not need to store the matrix in memory. The LinearOperator class allows us to construct an object which acts as the matrix for GMRES and can compute the matrix-vector product A*v directly. That is, we write a function mv(v) which takes a vector v as input and returns mv(v) = A*v. Then, we can use the LinearOperator class to create A_LinOp = …
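A minimal sketch of that pattern (the structure of A here, a diagonal plus a rank-one term, is hypothetical, standing in for whatever pattern the real matrix follows):

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    n = 100
    d = np.linspace(1.0, 2.0, n)

    def mv(v):
        # computes A @ v without materializing A, for A = diag(d) + ones*ones'/n
        return d * v + v.sum() / n

    A_LinOp = LinearOperator((n, n), matvec=mv, dtype=np.float64)

    b = np.ones(n)
    x, info = gmres(A_LinOp, b)
    assert info == 0 and np.linalg.norm(mv(x) - b) < 1e-3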

Null Space Binary Matrix : Java

隐身守侯 submitted on 2019-12-06 16:53:55

Here is my question: how do you compute the kernel of a binary matrix? Computing the kernel (or null space, if you prefer) in Java is pretty simple in the real space; there are already a lot of classes for it, so there is no need to reinvent the wheel, we just have to use them!

    double[][] data = new double[3][3];
    // ... fill the matrix
    SimpleMatrix m = new SimpleMatrix(data);
    SimpleSVD svd = m.svd();
    SimpleMatrix nullSpace = svd.nullSpace();
    nullSpace.print();

(These classes come from: http://efficient-java-matrix-library.googlecode.com/svn-history/r244/javadoc/ver0.14/org/ejml/data/package-summary …
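Over GF(2), however, the SVD route does not apply: arithmetic is mod 2, so the kernel has to be found by Gaussian elimination over that field, where subtraction is XOR. A sketch of the algorithm (in Python rather than the asker's Java, purely to illustrate the steps):

    import numpy as np

    def null_space_gf2(M):
        """Basis of the null space of binary matrix M over GF(2), via Gauss-Jordan mod 2."""
        A = np.array(M, dtype=np.uint8) % 2
        rows, cols = A.shape
        pivot_cols, r = [], 0
        for c in range(cols):
            pivot = next((i for i in range(r, rows) if A[i, c]), None)
            if pivot is None:
                continue                       # no pivot: c is a free column
            A[[r, pivot]] = A[[pivot, r]]      # swap the pivot row up
            for i in range(rows):              # clear column c in every other row
                if i != r and A[i, c]:
                    A[i] ^= A[r]               # row subtraction mod 2 is XOR
            pivot_cols.append(c)
            r += 1
        basis = []
        for c in range(cols):                  # one basis vector per free column
            if c in pivot_cols:
                continue
            v = np.zeros(cols, dtype=np.uint8)
            v[c] = 1
            for i, pc in enumerate(pivot_cols):
                v[pc] = A[i, c]                # back-substitute from the reduced rows
            basis.append(v)
        return basis

    M = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]      # row3 = row1 + row2 (mod 2)
    for v in null_space_gf2(M):
        assert not (np.array(M) @ v % 2).any() # M v = 0 over GF(2)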

Broadcasting np.dot vs tf.matmul for tensor-matrix multiplication (Shape must be rank 2 but is rank 3 error)

依然范特西╮ submitted on 2019-12-06 16:34:02

Let's say I have the following tensors:

    X = np.zeros((3, 201, 340))
    Y = np.zeros((340, 28))

Taking a dot product of X and Y succeeds with numpy and yields a tensor of shape (3, 201, 28). However, with tensorflow I get the following error: Shape must be rank 2 but is rank 3 ... Minimal code example:

    X = np.zeros((3, 201, 340))
    Y = np.zeros((340, 28))
    print(np.dot(X, Y).shape)  # successful: (3, 201, 28)
    tf.matmul(X, Y)            # erroneous

Any idea how to achieve the same result with tensorflow? Since you are working with tensors, it would be better (for performance) to use tensordot there than np …
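A minimal sketch of the tensordot route that the truncated answer begins to suggest (shapes follow the question):

    import numpy as np
    import tensorflow as tf

    X = tf.zeros((3, 201, 340))
    Y = tf.zeros((340, 28))

    # contract the last axis of X with the first axis of Y, like np.dot
    Z = tf.tensordot(X, Y, axes=1)
    print(Z.shape)  # (3, 201, 28)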

MATLAB: Block matrix multiplying without loops

点点圈 submitted on 2019-12-06 11:55:14

I have a block matrix [A B C ...] and a matrix D (all 2-dimensional). D has dimensions y-by-y, and A, B, C, etc. are each z-by-y. Basically, what I want to compute is the matrix [D*(A'); D*(B'); D*(C'); ...], where X' refers to the transpose of X. However, I want to accomplish this without loops for speed considerations. I have been playing with the reshape command for several hours now, and I know how to use it in other cases, but this use case is different from the others and I cannot figure it out. I also would like to avoid using multi-dimensional matrices if at all possible. Honestly …
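For reference, a NumPy sketch of the reshape idea (note it does use a 3-D intermediate, which the asker hoped to avoid; the sizes z, y, k are hypothetical):

    import numpy as np

    z, y, k = 2, 3, 4
    blocks = [np.random.rand(z, y) for _ in range(k)]
    M = np.hstack(blocks)          # the block matrix [A B C ...], shape (z, k*y)
    D = np.random.rand(y, y)

    # split the columns back into k blocks, apply D to each transposed block,
    # and stack the results vertically: [D@A.T; D@B.T; D@C.T; ...]
    result = np.einsum('ab,zkb->kaz', D, M.reshape(z, k, y)).reshape(k * y, z)

    assert np.allclose(result, np.vstack([D @ Bk.T for Bk in blocks]))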

SSE matrix-matrix multiplication

拟墨画扇 submitted on 2019-12-06 10:47:25

I'm having trouble doing matrix-matrix multiplication with SSE in C. Here is what I have so far:

    #define N 1000
    void matmulSSE(int mat1[N][N], int mat2[N][N], int result[N][N]) {
        int i, j, k;
        __m128i vA, vB, vR;
        for(i = 0; i < N; ++i) {
            for(j = 0; j < N; ++j) {
                vR = _mm_setzero_si128();
                for(k = 0; k < N; k += 4) {
                    //result[i][j] += mat1[i][k] * mat2[k][j];
                    vA = _mm_loadu_si128((__m128i*)&mat1[i][k]);
                    vB = _mm_loadu_si128((__m128i*)&mat2[k][j]);
                    //how well does the k += 4 work here? Should it be unrolled?
                    vR = _mm_add_epi32(vR, _mm_mul_epi32(vA, vB));
                }
                vR = _mm_hadd_epi32(vR, vR);
                vR = _mm_hadd …
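Two things to note about the code as posted: _mm_loadu_si128 on &mat2[k][j] loads the four consecutive elements mat2[k][j..j+3] of row k, not the column elements mat2[k..k+3][j] that the commented-out scalar line needs, and _mm_mul_epi32 multiplies only the even 32-bit lanes into two 64-bit products (the four-lane 32-bit multiply is SSE4.1's _mm_mullo_epi32). The usual remedy is to reorder the loops to i-k-j so the inner loop walks a contiguous row of mat2. A NumPy sketch of that access pattern:

    import numpy as np

    N = 8
    mat1 = np.random.randint(0, 10, (N, N)).astype(np.int32)
    mat2 = np.random.randint(0, 10, (N, N)).astype(np.int32)
    result = np.zeros((N, N), dtype=np.int32)

    # i-k-j ordering: broadcast mat1[i][k] over a contiguous row of mat2,
    # which is exactly the access pattern that vectorizes cleanly with SSE
    for i in range(N):
        for k in range(N):
            result[i, :] += mat1[i, k] * mat2[k, :]

    assert np.array_equal(result, mat1 @ mat2)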

How do I rotate an arkit 4x4 matrix around Y using Apple's SIMD library?

放肆的年华 submitted on 2019-12-06 08:58:31

I am trying to implement some code based on an ARKit demo where someone used this helper function to place a waypoint:

    let rotationMatrix = MatrixHelper.rotateAboutY(degrees: bearing * -1)

How can I implement the .rotateAboutY function using the SIMD library and not GLKit? To make it easier, I could start from the origin point. I'm not too handy with the matrix math, so a more basic explanation would be helpful.

The rotation-about-Y matrix is:

    | cos(angle)   0   sin(angle) |
    |      0       1       0      |
    | -sin(angle)  0   cos(angle) |

Rotation counter-clockwise around Y:

    | cos(angle)   0  -sin(angle) |
    |      0       1       0      |
    | sin(angle)   0   cos(angle) | …
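The asker wants Swift's simd types, but the matrix itself is the same everywhere; a NumPy sketch of the 4x4 homogeneous version (column-vector convention, matching the first matrix above):

    import numpy as np

    def rotate_about_y(degrees):
        a = np.radians(degrees)
        c, s = np.cos(a), np.sin(a)
        return np.array([
            [  c, 0.0,   s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [ -s, 0.0,   c, 0.0],
            [0.0, 0.0, 0.0, 1.0],
        ])

    # rotating the +Z axis by 90 degrees about Y lands on +X
    v = np.array([0.0, 0.0, 1.0, 1.0])
    assert np.allclose(rotate_about_y(90) @ v, [1.0, 0.0, 0.0, 1.0])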