matrix-multiplication | 易学教程

Rotating a vector using Matrix.rotateM

Question: I have made a simple class called Vector3. It's a 3-dimensional vector with some basic math implementations. Now I want to be able to rotate this single vector, but I get an exception. I have this: private static final float[] matrix = new float[16]; private static final float[] inVec = new float[4]; private static final float[] outVec = new float[4]; public Vector3 rotate(float angle, float axisX, float axisY, float axisZ) { inVec[0] = x; inVec[1] = y; inVec[2] = z; inVec[3] = 1; Matrix
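
As a rough illustration of the operation the question is after (rotating one 3D vector about an arbitrary axis), here is a minimal Python sketch based on Rodrigues' rotation formula. It does not use the Android `Matrix` class and says nothing about the cause of the exception. The function name `rotate` and the choice of degrees for the angle are assumptions made for this example.

```python
import math


def rotate(v, angle_deg, axis):
    """Rotate 3-vector v by angle_deg degrees about axis (right-hand rule).

    Hypothetical helper for illustration; uses Rodrigues' formula:
    v' = v*cos(t) + (k x v)*sin(t) + k*(k . v)*(1 - cos(t)), with k the unit axis.
    """
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    kx, ky, kz = ax / norm, ay / norm, az / norm

    x, y, z = v
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)

    dot = kx * x + ky * y + kz * z
    cross = (ky * z - kz * y, kz * x - kx * z, kx * y - ky * x)

    return (
        x * c + cross[0] * s + kx * dot * (1.0 - c),
        y * c + cross[1] * s + ky * dot * (1.0 - c),
        z * c + cross[2] * s + kz * dot * (1.0 - c),
    )
```

For example, `rotate((1.0, 0.0, 0.0), 90.0, (0.0, 0.0, 1.0))` gives approximately `(0, 1, 0)`, up to floating-point rounding.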

Armadillo sparse real matrix multiplication with complex vector

Question: I'm trying to multiply a sparse real matrix with a complex vector, but the program does not compile. If I change the vector to real or the matrix to dense, then everything goes through. Sample code: #define ARMA_64BIT_WORD #include <armadillo> #include <iostream> #include <stdio.h> #include <math.h> using namespace arma; int main(){ size_t n(5); vec vR(randu<vec>(n)), vI(randu<vec>(n)); //Create random complex vector 'v' cx_vec v(vR, vI); std::cout<<"\n\tMultiplying real matrix with

Most efficient matrix multiplication in C using fork() and IPC

Question: I need to implement concurrent matrix multiplication in C using multiple processes. I understand that because each process has its own private address space, I will have to use some form of interprocess communication (IPC). I did some looking around and couldn't find many implementations that didn't use threads. I was wondering if anyone knew the best way to go about this, either using shared memory, message passing, or pipes? I am not asking for a solution, but rather, if anyone knows,
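
To make one of the options named in the question concrete, the sketch below shows the shared-memory variant in Python rather than C: the parent creates an anonymous shared mapping, forks workers that each compute a subset of the result rows, and reads the result back after waiting for them. It is an illustration under assumptions, not a claim about which IPC mechanism is fastest. The name `fork_matmul`, the row-interleaved work split, and the use of non-empty list-of-lists matrices are all choices made for this example, and it only runs on Unix-like systems where `os.fork` is available.

```python
import mmap
import os
import struct


def fork_matmul(a, b, workers=2):
    """Multiply matrices a (n x m) and b (m x p) using forked worker processes.

    Hypothetical helper: results are written into an anonymous shared mapping,
    so no pipes or message passing are needed. Assumes non-empty matrices.
    """
    n, m, p = len(a), len(b), len(b[0])
    # Anonymous mapping; on Unix the default flag is MAP_SHARED, so writes made
    # by child processes after fork() are visible to the parent.
    buf = mmap.mmap(-1, n * p * 8)

    pids = []
    for w in range(workers):
        pid = os.fork()
        if pid == 0:
            # Child: compute rows w, w + workers, w + 2*workers, ...
            status = 1
            try:
                for i in range(w, n, workers):
                    for j in range(p):
                        acc = sum(a[i][k] * b[k][j] for k in range(m))
                        struct.pack_into("d", buf, (i * p + j) * 8, acc)
                status = 0
            finally:
                os._exit(status)  # never fall through into the parent's code
        pids.append(pid)

    # Parent: wait for every worker before reading the shared buffer.
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if status != 0:
            raise RuntimeError("worker process failed")

    flat = struct.unpack(f"{n * p}d", buf[: n * p * 8])
    buf.close()
    return [list(flat[i * p:(i + 1) * p]) for i in range(n)]
```

Each worker writes to a disjoint set of rows, so the sketch needs no locking; the only synchronization is the parent waiting for the children to exit.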

Issues multiplying Mat matrices

Question: I am trying to project an image onto the eigenface covariance matrix that the EigenFacesRecognizer of OpenCV returns. I use the following code to load the eigenfaces parameters, load an image, and try to project the sample image onto the PCA subspace. Ptr<FaceRecognizer> model = createEigenFaceRecognizer(); model->load("eigenfaces.yml"); // Load eigenfaces parameters Mat eigenvalues = model->getMat("eigenvalues"); // Eigen values of PCA Mat convMat = model->getMat("eigenvectors"); //Covariance matrix Mat

Matrix multiplication on GPU. Memory bank conflicts and latency hiding

Question: Edit: achievements over time are listed at the end of this question (~1 Tflops/s so far). I'm writing a kind of math library for C# that uses OpenCL (GPU) from a C++ DLL, and I have already done some optimizations on single-precision square matrix-matrix multiplication (for learning purposes and for possible reuse in a neural-network program later). The kernel code below gets the 1D array v1 as rows of matrix1 (1024x1024) and the 1D array v2 as columns of matrix2 (1024x1024, transpose optimization) and puts the result

reading/writing a matrix with a stride much larger than its width causes a big loss in performance

Question: I'm doing dense matrix multiplication on 1024x1024 matrices. I do this with loop blocking/tiling, using 64x64 tiles. I have created a highly optimized 64x64 matrix multiplication function (see the end of my question for the code). gemm64(float *a, float *b, float *c, int stride). Here is the code which runs over the tiles. A 1024x1024 matrix which has 16x16 tiles. for(int i=0; i<16; i++) { for(int j=0; j<16; j++) { for(int k=0; k<16; k++) { gemm64(&a[64*(i*1024 + k)], &b[64*(k*1024 + j)], &c
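
The tile addressing in the loop above can be checked on a small case. In a row-major array with row stride `n`, the offset `tile*(i*n + k)` equals `(tile*i)*n + tile*k`, which is the top-left element of tile `(i, k)`. The Python sketch below mirrors that loop structure with a plain (unoptimized) tile kernel. The names `gemm_tile` and `tiled_matmul` are hypothetical, and the offset used for `c`, `tile*(i*n + j)`, is an assumption, since the original call is cut off before that argument.

```python
def gemm_tile(a, b, c, a0, b0, c0, tile, stride):
    """Accumulate c_tile += a_tile * b_tile.

    a, b, c are flat row-major lists; a0, b0, c0 are the flat offsets of the
    top-left element of each tile; stride is the full row length of the matrix.
    """
    for r in range(tile):
        for col in range(tile):
            acc = 0.0
            for t in range(tile):
                acc += a[a0 + r * stride + t] * b[b0 + t * stride + col]
            c[c0 + r * stride + col] += acc


def tiled_matmul(a, b, n, tile):
    """Multiply two n x n row-major matrices tile by tile (n divisible by tile)."""
    if n % tile != 0:
        raise ValueError("n must be a multiple of tile")
    blocks = n // tile
    c = [0.0] * (n * n)
    for i in range(blocks):
        for j in range(blocks):
            for k in range(blocks):
                gemm_tile(
                    a, b, c,
                    tile * (i * n + k),  # tile (i, k) of a
                    tile * (k * n + j),  # tile (k, j) of b
                    tile * (i * n + j),  # tile (i, j) of c (assumed offset)
                    tile, n,
                )
    return c
```

With `n = 1024` and `tile = 64` this reproduces the 16x16 grid of tiles described in the question; the sketch is meant for checking the indexing, not for performance.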

Cache management for sparse matrix multiplication using OpenMP

Question: I am having issues with what I think is some false caching: I am only getting a small speedup when using the following code compared to the non-parallel version. matrix1 and matrix2 are sparse matrices in a struct with (row, col, val) format. void pMultiply(struct SparseRow *matrix1, struct SparseRow *matrix2, int m1Rows, int m2Rows, struct SparseRow **result) { *result = malloc(1 * sizeof(struct SparseRow)); int resultNonZeroEntries = 0; #pragma omp parallel for atomic for(int i = 0; i <

When is 'crossprod' preferred to '%*%', and when isn't?

Question: When exactly is crossprod(X,Y) preferred to t(X) %*% Y when X and Y are both matrices? The documentation says: "Given matrices x and y as arguments, return a matrix cross-product. This is formally equivalent to (but usually slightly faster than) the call t(x) %*% y (crossprod) or x %*% t(y) (tcrossprod)." So when is it not faster? When searching online I found several sources that stated either that crossprod is generally preferred and should be used as default (e.g. here), or that it
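
The formal equivalence quoted from the documentation can be illustrated outside R. The Python sketch below computes the cross-product by accumulating outer products of matching rows, so the transpose is never materialized, and compares it with the explicit transpose-then-multiply route. It only demonstrates that the two give the same matrix. It makes no claim about how R implements `crossprod` or about when either form is faster, and the function names are hypothetical.

```python
def crossprod(x, y):
    """Return the equivalent of t(x) %*% y for list-of-lists matrices.

    x is n x p and y is n x q; the result is p x q. The transpose of x is
    never built: each row pair (x[r], y[r]) contributes one outer product.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of rows")
    p, q = len(x[0]), len(y[0])
    out = [[0.0] * q for _ in range(p)]
    for xr, yr in zip(x, y):
        for i in range(p):
            for j in range(q):
                out[i][j] += xr[i] * yr[j]
    return out


def transpose_then_multiply(x, y):
    """Reference version: build t(x) explicitly, then do an ordinary product."""
    xt = [list(col) for col in zip(*x)]
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*y)] for row in xt]
```

For `x = [[1, 2], [3, 4], [5, 6]]` and `y = [[1, 0], [0, 1], [1, 1]]`, both functions return `[[6, 8], [8, 10]]` (as floats in the first case).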

Matrix to EulerAngles

Question: I'm trying to extract Euler angles from a rotation matrix. My conventions: Matrix column-major, Coordinate System right-handed, Positive Angle right-handed, Rotation Order YXZ (first heading, then attitude, then bank). I've found this, but couldn't use it because they use other axes orders: (http://www.euclideanspace.com/maths/geometry/rotations/conversions/matrixToEuler/index.htm) /** this conversion uses conventions as described on page: * http://www.euclideanspace.com/maths/geometry
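
One possible reading of these conventions is R = Ry(heading) · Rx(attitude) · Rz(bank), applied to column vectors in a right-handed system. Under that assumption (which the excerpt does not confirm), the angles can be read back from three entries of R. The Python sketch below builds R from the three elementary rotations and inverts it. The matrices are indexed as `r[row][col]`, so column-major storage would only change how the entries are laid out in memory, and all function names are hypothetical.

```python
import math


def matmul3(p, q):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def euler_yxz_to_matrix(heading, attitude, bank):
    """Build R = Ry(heading) * Rx(attitude) * Rz(bank); angles in radians."""
    ch, sh = math.cos(heading), math.sin(heading)
    ca, sa = math.cos(attitude), math.sin(attitude)
    cb, sb = math.cos(bank), math.sin(bank)
    ry = [[ch, 0.0, sh], [0.0, 1.0, 0.0], [-sh, 0.0, ch]]
    rx = [[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]]
    rz = [[cb, -sb, 0.0], [sb, cb, 0.0], [0.0, 0.0, 1.0]]
    return matmul3(ry, matmul3(rx, rz))


def matrix_to_euler_yxz(r):
    """Recover (heading, attitude, bank) from R under the assumed convention.

    With R as above: r[1][2] = -sin(attitude),
    r[0][2] = sin(heading)*cos(attitude), r[2][2] = cos(heading)*cos(attitude),
    r[1][0] = cos(attitude)*sin(bank),    r[1][1] = cos(attitude)*cos(bank).
    """
    attitude = math.asin(max(-1.0, min(1.0, -r[1][2])))  # clamp against rounding
    heading = math.atan2(r[0][2], r[2][2])
    bank = math.atan2(r[1][0], r[1][1])
    return heading, attitude, bank
```

The extraction is only well defined while cos(attitude) is non-zero; at attitude = ±90° heading and bank are no longer independent (gimbal lock), and this sketch does not handle that case specially.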

Extract the minor matrix from a 3x3 based on input i,j

Question: For a given 3x3 matrix, for example: A = [3 1 -4 ; 2 5 6 ; 1 4 8] If I need the minor matrix for entry (1,2): Minor = [2 6 ; 1 8] I already wrote a program to read in the matrix from a text file, and I am supposed to write a subroutine to extract the minor matrix from the main matrix A based on the user inputs for i,j. I am very new to Fortran and have no clue how to do that. I made some very desperate attempts, but I am sure there is a cleaner way to do that. I got so desperate I wrote 9 if
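
The operation itself is just "drop row i and column j", which needs no case analysis. The sketch below shows the idea in Python rather than Fortran, using 1-based indices to match the question. The function name `minor_matrix` is hypothetical.

```python
def minor_matrix(a, i, j):
    """Return matrix a with row i and column j removed (1-based indices)."""
    if not (1 <= i <= len(a)) or not (1 <= j <= len(a[0])):
        raise IndexError("i and j must be valid 1-based row/column indices")
    return [
        [value for col, value in enumerate(row, start=1) if col != j]
        for r, row in enumerate(a, start=1)
        if r != i
    ]
```

With the matrix from the question, `minor_matrix([[3, 1, -4], [2, 5, 6], [1, 4, 8]], 1, 2)` returns `[[2, 6], [1, 8]]`. The same filter works for any matrix size, not only 3x3.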