matrix-multiplication

Efficient Algorithms for Computing a matrix times its transpose [closed]

拈花ヽ惹草 submitted 2019-11-29 13:52:07
For a class, a question posed by my teacher was the algorithmic cost of multiplying a matrix by its transpose. With the standard 3-loop matrix multiplication algorithm, the cost is O(N^3), and I wondered if there was a way to manipulate or take advantage of matrix…
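One exploitable property: A·Aᵀ is always symmetric, so only the entries on and above the diagonal need to be computed and the rest can be mirrored, saving roughly half the multiplications (the asymptotic cost remains O(N³)). A minimal pure-Python sketch of the idea:

```python
def mat_times_transpose(A):
    """Compute C = A @ A.T, exploiting the symmetry of the result.

    Only the upper triangle (j >= i) is computed; the lower triangle
    is filled in by mirroring, so about half the multiplications of
    a full matmul are performed.
    """
    n, m = len(A), len(A[0])
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):                       # upper triangle only
            s = sum(A[i][k] * A[j][k] for k in range(m))
            C[i][j] = s
            C[j][i] = s                             # mirror across the diagonal
    return C
```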

Is there a Java library for better linear regression? (E.g., iteratively reweighted least squares) [closed]

时光总嘲笑我的痴心妄想 submitted 2019-11-29 13:26:54
I am struggling to find a way to perform better linear regression. I have been using the Moore-Penrose pseudoinverse and QR decomposition from the JAMA library, but the results are not satisfactory. Would ojAlgo be useful? I have been hitting accuracy limits that I know should not be there. The algorithm should be capable of reducing the impact of an input variable to zero. Perhaps this takes the form of iteratively reweighted least squares, but I do not know that algorithm and cannot find a library for it. The output should be a weight matrix or vector such that matrix multiplication of the…
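IRLS itself is short enough to sketch: each step solves a weighted least-squares problem whose weights are derived from the previous step's residuals, so points with large residuals are progressively downweighted. The pure-Python sketch below fits a line with Huber-style weights; the weight function, `delta`, and iteration count are illustrative choices, not taken from any particular library:

```python
def irls_line_fit(xs, ys, delta=1.0, iters=100):
    """Fit y ~ w0 + w1*x by iteratively reweighted least squares.

    Each iteration solves the 2x2 weighted normal equations, with
    Huber-style weights w_i = min(1, delta/|r_i|) computed from the
    previous iteration's residuals r_i (an illustrative choice).
    """
    w = [1.0] * len(xs)          # start from ordinary least squares
    w0 = w1 = 0.0
    for _ in range(iters):
        # Weighted normal equations:
        # [[Sw, Swx], [Swx, Swxx]] @ [w0, w1] = [Swy, Swxy]
        Sw = sum(w)
        Swx = sum(wi * x for wi, x in zip(w, xs))
        Swxx = sum(wi * x * x for wi, x in zip(w, xs))
        Swy = sum(wi * y for wi, y in zip(w, ys))
        Swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        det = Sw * Swxx - Swx * Swx
        w0 = (Swy * Swxx - Swxy * Swx) / det
        w1 = (Sw * Swxy - Swx * Swy) / det
        # Recompute weights from residuals (downweight large residuals)
        w = [min(1.0, delta / max(abs(y - (w0 + w1 * x)), 1e-12))
             for x, y in zip(xs, ys)]
    return w0, w1
```

On clean linear data this reduces to ordinary least squares; with outliers the fit is pulled toward the bulk of the points, which is the "reduce the impact of an input to (near) zero" behavior the question asks about.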

Poor maths performance in C vs Python/numpy

陌路散爱 submitted 2019-11-29 13:16:21
Near-duplicate / related: How does BLAS get such extreme performance? (If you want fast matmul in C, seriously just use a good BLAS library unless you want to hand-tune your own asm version.) But that doesn't mean it's not interesting to see what happens when you compile less-optimized matrix code. See also: how to optimize matrix multiplication (matmul) code to run fast on a single processor core; Matrix Multiplication with blocks. Out of interest, I decided to compare the performance of (inexpertly) handwritten C vs. Python/numpy performing a simple matrix multiplication of two large, square matrices…
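The gap usually comes down to numpy dispatching `@`/`dot` to an optimized BLAS (blocked, vectorized, multi-threaded), while a naive triple loop, whether in C or Python, makes poor use of cache and SIMD. For reference, the textbook loop such benchmarks compare against, written in pure Python:

```python
def naive_matmul(A, B):
    """Textbook O(n^3) triple-loop matrix multiply on row-major lists.

    Uses i-k-j order so the innermost loop streams over rows of B
    with unit stride; numpy's A @ B instead calls an optimized BLAS,
    which is why it is orders of magnitude faster than loops like this.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = A[i][k]                  # hoist the reused element
            for j in range(p):
                C[i][j] += aik * B[k][j]
    return C
```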

How to do R multiplication with Nx1 1xM for Matrix NxM?

百般思念 submitted 2019-11-29 13:11:49
I want to do a simple column (Nx1) times row (1xM) multiplication, resulting in an (NxM) matrix. Code where I create a row by sequence, and a column by transposing a similar sequence:

```r
row1 <- seq(1:6)
col1 <- t(seq(1:6))
col1 * row1
```

Output, which indicates that R thinks of these more like columns:

```
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    4    9   16   25   36
```

Expected output: an NxM matrix. OS: Debian 8.5; Linux kernel: 4.6 backports; Hardware: Asus Zenbook UX303UA. In this case using outer would be a more natural choice: outer(1:6, 1:6). In general, for two numerical vectors x and y, the matrix rank-1 operation can be…
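The rank-1 outer product that `outer(1:6, 1:6)` computes in R is easy to state in any language: result[i][j] = x[i] * y[j]. A pure-Python equivalent, for comparison:

```python
def outer(x, y):
    """Rank-1 outer product of two vectors: an NxM matrix with
    result[i][j] = x[i] * y[j] -- the analogue of R's outer(x, y)."""
    return [[xi * yj for yj in y] for xi in x]
```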

How to multiply a matrix in C#?

流过昼夜 submitted 2019-11-29 11:55:51
I cannot get this method to work. It is meant to multiply a matrix by a given one. Could someone help me correct it, please?

```csharp
class Matriz
{
    public double[,] structure;

    // Other class methods

    public void multiplyBy(Matrix m)
    {
        if (this.structure.GetLength(1) == m.structure.GetLength(0))
        {
            Matriz resultant = new Matriz(this.structure.GetLength(0), m.structure.GetLength(1));
            for (int i = 0; i < this.structure.GetLength(0) - 1; i++)
            {
                for (int j = 0; j < m.structure.GetLength(1) - 1; j++)
                {
                    resultant.structure[i, j] = 0;
                    for (int z = 0; z < this.structure.GetLength(1) - 1; z++)
                    {
                        resultant…
```
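The visible bug is an off-by-one in every loop bound: `i < this.structure.GetLength(0) - 1` stops one row short, so the last row, last column, and last inner term are never computed; the bounds should be `i < GetLength(0)` (and likewise for `j` and `z`). The corrected loop structure, shown as a pure-Python sketch of the same method:

```python
def multiply(a, b):
    """Multiply two matrices given as lists of rows.

    The loops run over the FULL index ranges; the original C# code
    used `< GetLength(...) - 1`, which silently dropped the last
    row, column, and inner-product term.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner, "inner dimensions must match"
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):            # NOT range(rows - 1)
        for j in range(cols):        # NOT range(cols - 1)
            for z in range(inner):   # NOT range(inner - 1)
                c[i][j] += a[i][z] * b[z][j]
    return c
```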

compute only diagonals of matrix multiplication in R

纵饮孤独 submitted 2019-11-29 11:52:04
I need only the diagonal elements from a matrix multiplication in R. As Z is huge, I want to avoid the full multiplication…

```r
Z <- matrix(c(1,1,1,2,3,4), ncol = 2)
Z
#     [,1] [,2]
#[1,]    1    2
#[2,]    1    3
#[3,]    1    4

X <- matrix(c(10,-5,-5,20), ncol = 2)
X
#     [,1] [,2]
#[1,]   10   -5
#[2,]   -5   20

Z %*% D %*% t(Z)
#     [,1] [,2] [,3]
#[1,]   70  105  140
#[2,]  105  160  215
#[3,]  140  215  290

diag(Z %*% D %*% t(Z))
#[1]  70 160 290
```

X is always a small square matrix (2x2, 3x3 or 4x4), where Z will have the…
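The trick is that diag(Z X Zᵀ)[i] = zᵢ X zᵢᵀ, where zᵢ is row i of Z, so the diagonal can be computed row by row in O(N·k²) for k = ncol(Z) instead of forming the O(N²) product (in R this is `rowSums((Z %*% X) * Z)`). A pure-Python sketch:

```python
def diag_of_sandwich(Z, X):
    """diag(Z @ X @ Z.T) without forming the full NxN product.

    Entry i is z_i @ X @ z_i for row z_i of Z, so the cost is
    O(N * k^2) for a small k x k matrix X instead of O(N^2 * k).
    """
    def vec_times_mat(v, M):                 # v @ M for a small k x k M
        k = len(v)
        return [sum(v[r] * M[r][c] for r in range(k)) for c in range(k)]

    return [sum(u * zi for u, zi in zip(vec_times_mat(z, X), z)) for z in Z]
```

With the question's data this reproduces the diag(...) result directly.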

Laderman's 3x3 matrix multiplication with only 23 multiplications, is it worth it?

只谈情不闲聊 submitted 2019-11-29 11:17:35
Take the product of two 3x3 matrices, A*B = C. Naively this requires 27 multiplications using the standard algorithm. If one were clever, it can be done using only 23 multiplications, a result found in 1973 by Laderman. The technique involves saving intermediate steps and combining them in the right way. Now let's fix a language and a type, say C++ with elements of double. If the Laderman algorithm were hard-coded versus the simple double loop, could we expect the performance of a modern…
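Laderman's 23-product scheme is too long to reproduce reliably here, but the save-intermediates idea it uses is the same one Strassen applied at the 2x2 level, where 7 multiplications replace the naive 8 by trading multiplies for extra additions (each m_i below is exactly one multiplication of two linear combinations):

```python
def strassen_2x2(A, B):
    """2x2 matrix product with 7 multiplications (Strassen, 1969)
    instead of the naive 8 -- the same multiply-for-add trade that
    Laderman's 23-multiplication 3x3 scheme makes."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]
```

Whether such schemes beat the plain loops on real hardware is exactly the question: the extra additions and lost regularity often cost more than the saved multiplications on modern pipelined, SIMD-capable CPUs.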

Why can't my CPU maintain peak performance in HPC

强颜欢笑 submitted 2019-11-29 09:11:48
I have developed a high-performance Cholesky factorization routine, which should reach peak performance at around 10.5 GFLOPs on a single CPU (without hyperthreading). But there is a phenomenon I don't understand when I test its performance. In my experiment, I measured the performance with increasing matrix dimension N, from 250 up to 10000. In my algorithm I have applied caching (with a tuned blocking factor), and data are always accessed with unit stride during computation, so cache…
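For reference, the computation being benchmarked factors a symmetric positive-definite A into a lower-triangular L with A = L·Lᵀ. A textbook unblocked version, in pure Python (a tuned routine restructures these loops into blocks so the working set stays cache-resident, which is what the question's blocking factor controls):

```python
from math import sqrt

def cholesky(A):
    """Unblocked Cholesky factorization: returns lower-triangular L
    with A == L @ L.T for symmetric positive-definite A.

    Column j is computed from the already-finished columns 0..j-1;
    blocked variants group these updates to improve cache reuse.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = sqrt(d)                    # fails if A is not SPD
        for i in range(j + 1, n):
            L[i][j] = (A[i][j]
                       - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L
```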

Improving the performance of Matrix Multiplication

大城市里の小女人 submitted 2019-11-29 07:57:27
This is my code for speeding up matrix multiplication, but it is only 5% faster than the simple version. What can I do to boost it as much as possible? (The tables are accessed as, for example, C[sub2ind(i,j,n)] for the C[i, j] position.)

```c
void matrixMultFast(float * const C,       /* output matrix */
                    float const * const A, /* first matrix */
                    float const * const B, /* second matrix */
                    int const n,           /* number of rows/cols */
                    int const ib,          /* size of i block */
                    int const jb,          /* size of j block */
                    int const kb)          /* size of k block */
{
    int i = 0, j = 0, jj = 0, k = 0, kk = 0;
    float sum;
    for (i = 0; i < n; i++)
        for (j = 0; j < n…
```
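For comparison, the blocking (tiling) idea the code above is reaching for looks like this as a pure-Python sketch; the block size is a tuning parameter, and in C the large wins typically come from blocking plus an i-k-j inner loop order, hoisting reused values, and letting the compiler vectorize the unit-stride inner loop:

```python
def blocked_matmul(A, B, bs=32):
    """C = A @ B for n x n row-major lists, computed in bs x bs tiles
    so each tile of A, B, and C stays cache-resident while reused."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, bs):
        for kk in range(0, n, bs):
            for jj in range(0, n, bs):
                # Multiply one tile pair; inner loops have unit stride.
                for i in range(ii, min(ii + bs, n)):
                    for k in range(kk, min(kk + bs, n)):
                        aik = A[i][k]
                        for j in range(jj, min(jj + bs, n)):
                            C[i][j] += aik * B[k][j]
    return C
```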

Cache friendly method to multiply two matrices

和自甴很熟 submitted 2019-11-29 07:15:00
I intend to multiply 2 matrices using a cache-friendly method (one that would lead to fewer misses). I found out that this can be done with a cache-friendly transpose function, but I am not able to find this algorithm. Can I know how to achieve this? The word you are looking for is thrashing. Searching for "thrashing matrix multiplication" in Google yields more results. A standard multiplication algorithm for c = a*b would look like:

```csharp
void multiply(double[,] a, double[,] b, double[,] c)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                c[i, j] += a[i, k…
```
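The standard loop above strides down b's columns in the innermost loop, which thrashes the cache for row-major storage. The transpose trick the question asks about copies b once into bt = bᵀ, after which the inner loop reads both operands row-wise with unit stride. A pure-Python sketch:

```python
def multiply_transposed(a, b):
    """c = a @ b, iterating over bt = b.T so the innermost loop reads
    both a[i] and bt[j] with unit stride -- the transpose trick that
    avoids thrashing the cache on b's columns."""
    n, m, p = len(a), len(b), len(b[0])
    bt = [[b[k][j] for k in range(m)] for j in range(p)]   # transpose once, O(m*p)
    return [[sum(a[i][k] * bt[j][k] for k in range(m)) for j in range(p)]
            for i in range(n)]
```

The one-time O(m·p) transpose is repaid many times over, since each of the n·p output entries then scans two contiguous rows.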