How to optimize matrix multiplication operation [duplicate]

Question

I need to perform a lot of matrix operations in my application. The most time-consuming one is matrix multiplication. I implemented it this way:

template<typename T>
Matrix<T> Matrix<T>::operator * (Matrix& matrix)
{


    Matrix<T> multipliedMatrix = Matrix<T>(this->rows,matrix.GetColumns(),0);

    for (int i=0;i<this->rows;i++)
    {
        for (int j=0;j<matrix.GetColumns();j++)
        {
            multipliedMatrix.datavector.at(i).at(j) = 0;
            for (int k=0;k<this->columns ;k++)
            {
                multipliedMatrix.datavector.at(i).at(j) +=  datavector.at(i).at(k) * matrix.datavector.at(k).at(j);
            }
            //cout<<(*multipliedMatrix)[i][j]<<endl;
        }
    }
    return multipliedMatrix;
}

Is there a better way to write it? So far, matrix multiplication takes most of the time in my application. Is there a good, fast library for this kind of thing? I would rather not use libraries that run mathematical operations on the graphics card, because I work on a laptop with an integrated graphics card.

Answer 1:

Eigen is one of the fastest linear algebra libraries out there, if not the fastest. It is well written and of high quality. It also uses expression templates, which make the code you write more readable. Version 3, just released, uses OpenMP for data parallelism. With the MatrixXd type used in the basic example below, a matrix product is written simply as a * b.

#include <iostream>
#include <Eigen/Dense>

using Eigen::MatrixXd;

int main()
{
  MatrixXd m(2,2);
  m(0,0) = 3;
  m(1,0) = 2.5;
  m(0,1) = -1;
  m(1,1) = m(1,0) + m(0,1);
  std::cout << m << std::endl;
}

Answer 2:

I think Boost uBLAS is definitely the way to go for this sort of thing. Boost is well designed, well tested, and used in a lot of applications.

Answer 3:

Consider the GNU Scientific Library or MV++.

If you're okay with C, BLAS is a low-level library that incorporates both C and C-wrapped FORTRAN routines and is used by a huge number of higher-level math libraries.

I don't know anything about this one, but another option might be Meschach, which seems to have decent performance.

Edit: With respect to your comment about not wanting to use libraries that use your graphics card, I'll point out that in many cases the libraries that use your graphics card are specialized implementations of standard (non-GPU) libraries. For example, various implementations of BLAS are listed on its Wikipedia page; only some are designed to leverage your GPU.

Answer 4:

There is a book called Introduction to Algorithms. You may want to check the chapter on dynamic programming. It has an excellent treatment of matrix-chain multiplication, which uses dynamic programming to choose the cheapest order in which to multiply a sequence of matrices. It's worth a read. This is in case you want to write your own logic instead of using a library.
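
Note that this dynamic-programming algorithm does not make a single product faster; it only helps when several matrices are multiplied in a row and the order of the products is free to choose. A minimal sketch of the idea follows. The function name minChainCost and the dims convention are illustrative assumptions, not taken from the book or from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumed convention: matrix i (0-based) has dimensions dims[i] x dims[i + 1].
// Returns the minimum number of scalar multiplications needed to compute
// the product of the whole chain.
long long minChainCost(const std::vector<long long>& dims)
{
    assert(dims.size() >= 2);
    const std::size_t n = dims.size() - 1;  // number of matrices in the chain

    // cost[i][j] = cheapest way to multiply matrices i..j (inclusive).
    std::vector<std::vector<long long>> cost(n, std::vector<long long>(n, 0));

    for (std::size_t len = 2; len <= n; ++len) {
        for (std::size_t i = 0; i + len <= n; ++i) {
            const std::size_t j = i + len - 1;
            cost[i][j] = std::numeric_limits<long long>::max();
            // Try every split point: (i..k) times (k+1..j).
            for (std::size_t k = i; k < j; ++k) {
                const long long c = cost[i][k] + cost[k + 1][j]
                                  + dims[i] * dims[k + 1] * dims[j + 1];
                if (c < cost[i][j]) cost[i][j] = c;
            }
        }
    }
    return cost[0][n - 1];
}
```

For three matrices of sizes 10x30, 30x5 and 5x60, minChainCost({10, 30, 5, 60}) returns 4500, the cost of multiplying the first two matrices first; the other order would cost 27000.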

Answer 5:

There are plenty of algorithms for efficient matrix multiplication.

Algorithms for efficient matrix multiplication

Look at the algorithms and find an implementation.

You can also make a multi-threaded implementation for it.
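
One possible shape for a multi-threaded version is to split the rows of the result between threads, since each output row can be computed independently. The sketch below assumes plain std::vector<std::vector<double>> storage instead of the Matrix class from the question; the names Mat and multiplyParallel are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Computes a * b, splitting the rows of the result across worker threads.
// Each thread writes only to its own rows, so no locking is needed.
Mat multiplyParallel(const Mat& a, const Mat& b, unsigned numThreads)
{
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    const std::size_t rows = a.size();
    const std::size_t inner = b.size();
    const std::size_t cols = b[0].size();
    Mat c(rows, std::vector<double>(cols, 0.0));

    // Multiplies the rows in the half-open range [begin, end).
    auto worker = [&](std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i)
            for (std::size_t k = 0; k < inner; ++k)
                for (std::size_t j = 0; j < cols; ++j)
                    c[i][j] += a[i][k] * b[k][j];
    };

    numThreads = std::max(1u, numThreads);
    // Rows per thread, rounded up so that every row is covered.
    const std::size_t chunk = (rows + numThreads - 1) / numThreads;

    std::vector<std::thread> threads;
    for (std::size_t begin = 0; begin < rows; begin += chunk)
        threads.emplace_back(worker, begin, std::min(begin + chunk, rows));
    for (auto& t : threads) t.join();
    return c;
}
```

A reasonable thread count to pass is std::thread::hardware_concurrency(). For small matrices the cost of starting threads will likely outweigh the gain, so this is only worth trying on large inputs.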

Answer 6:

What I'd do is reduce the number of calls to at(). For instance, in this loop:

for (int i=0;i<this->rows;i++)
{
    for (int j=0;j<matrix.GetColumns();j++)
    {
        multipliedMatrix.datavector.at(i).at(j) = 0;
        for (int k=0;k<this->columns;k++)
        {
            multipliedMatrix.datavector.at(i).at(j) += datavector.at(i).at(k) * matrix.datavector.at(k).at(j);
        }
    }
}

You're wasting a lot of time by calling at(i) on every iteration of the j and k loops.

What I'd do instead is:

for (int i=0;i<this->rows;i++)
{
    // I don't know the type of this object, but let's call it type MatrixRow
    MatrixRow & mmi = multipliedMatrix.datavector.at(i);
    MatrixRow & dvi = datavector.at(i);
    for (int j=0;j<matrix.GetColumns();j++)
    {
        // The element type is the template parameter T from the question
        T &mmij = mmi.at(j);
        mmij = 0;
        for (int k=0;k<this->columns;k++)
        {
            mmij += dvi.at(k) * matrix.datavector.at(k).at(j);
        }
    }
}

The above suggestions might not be syntactically correct, but you get the idea.

Also, if your memory is contiguously allocated, you can get even further speedups by not doing lookups for each j and each k, and using the appropriate pointer increments instead.

Also, the loop bounds might be inefficient, since they are evaluated on every iteration and each evaluation involves a function call or a dereference. That is, this->rows, matrix.GetColumns(), and this->columns could be stored in local integers. This might improve speed a lot.
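
To make the last two points concrete, here is a rough sketch that assumes the matrix is stored as one contiguous row-major block. FlatMatrix and multiply are hypothetical names, not the Matrix class from the question. The bounds are read once into locals, and rows are accessed through raw pointers rather than at(), which also skips the bounds check that at() performs on every call. The i-k-j loop order is my own addition: it makes the innermost loop walk both rows sequentially in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat matrix: element (i, j) lives at data[i * cols + j].
struct FlatMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;

    FlatMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(r * c, 0.0) {}
};

FlatMatrix multiply(const FlatMatrix& a, const FlatMatrix& b)
{
    assert(a.cols == b.rows);
    // Read the bounds once instead of on every loop iteration.
    const std::size_t n = a.rows, m = a.cols, p = b.cols;
    FlatMatrix c(n, p);  // elements start at zero

    for (std::size_t i = 0; i < n; ++i) {
        const double* aRow = a.data.data() + i * m;  // row i of a
        double* cRow = c.data.data() + i * p;        // row i of the result
        for (std::size_t k = 0; k < m; ++k) {
            const double aik = aRow[k];
            const double* bRow = b.data.data() + k * p;  // row k of b
            // Innermost loop touches cRow and bRow sequentially.
            for (std::size_t j = 0; j < p; ++j)
                cRow[j] += aik * bRow[j];
        }
    }
    return c;
}
```

Whether this beats the vector-of-vectors version in your application is something to measure with optimizations enabled; the compiler may already hoist some of these lookups on its own.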

Source: https://stackoverflow.com/questions/6061921/how-to-optimize-matrix-multiplication-operation

Tags

c++

matrix

matrix-multiplication