low RAM consuming c++ eigen solver

两盒软妹~` 提交于 2019-12-04 11:41:36

Both Lapackpp and Armadillo rely on Lapack to compute the eigenvalues and eigenvectors of complex matrices. The Lapack library provides different way to perform these operations for complex hermitian matrices.

  • The function zgeev() does not care about the matrix being Hermitian. This function is called by the Lapackpp library for matrices of type LaGenMatComplex in the function LaEigSolve. The function eig_gen() of the Armadillo library calls this function.

  • The function zheev() is dedicated to complex Hermitian matrices. It first call ZHETRD to reduce Hermitian matrix to tridiagonal form. Depending on whether the eigenvectors are needed, it then uses a QR algorithm to compute the eigenvalues and eigenvectors (if needed). The function eig_sym() of the Armadillo library call this function if the method std is selected.

  • The function zheevd() does the same thing as zheev() if the eigenvectors are not required. Otherwise, it makes use of a divide and conquert algorithm (see zstedc()). The function eig_sym() of the Armadillo library call this function if the method dc is selected. Since the divide and conquer proved to be faster for large matrices, it is now the default method.

Functions with more options are available in the Lapack library. (see zheevr() or zheevx). If you want to keep a dense matrix format, you can also try the ComplexEigenSolver of the Eigen library.

Here is a little C++ test using the C wrapper LAPACKE of the Lapack library. It is compiled by g++ main.cpp -o main2 -L /home/...../lapack-3.5.0 -llapacke -llapack -lblas

#include <iostream>

#include <complex>
#include <ctime>
#include <cstring>

#include "lapacke.h"

#undef complex
using namespace std;

int main()
{
    //int n = 3432;

    int n = 600;

    std::complex<double> *matrix=new std::complex<double>[n*n];
    memset(matrix, 0, n*n*sizeof(std::complex<double>));
    std::complex<double> *matrix2=new std::complex<double>[n*n];
    memset(matrix2, 0, n*n*sizeof(std::complex<double>));
    std::complex<double> *matrix3=new std::complex<double>[n*n];
    memset(matrix3, 0, n*n*sizeof(std::complex<double>));
    std::complex<double> *matrix4=new std::complex<double>[n*n];
    memset(matrix4, 0, n*n*sizeof(std::complex<double>));
    for(int i=0;i<n;i++){
        matrix[i*n+i]=42;
        matrix2[i*n+i]=42;
        matrix3[i*n+i]=42;
        matrix4[i*n+i]=42;
    }

    for(int i=0;i<n-1;i++){
        matrix[i*n+(i+1)]=20;
        matrix2[i*n+(i+1)]=20;
        matrix3[i*n+(i+1)]=20;
        matrix4[i*n+(i+1)]=20;

        matrix[(i+1)*n+i]=20;
        matrix2[(i+1)*n+i]=20;
        matrix3[(i+1)*n+i]=20;
        matrix4[(i+1)*n+i]=20;
    }

    double* w=new double[n];//eigenvalues

    //the lapack function zheev
    clock_t t;
    t = clock();
    LAPACKE_zheev(LAPACK_COL_MAJOR,'V','U', n,reinterpret_cast< __complex__ double*>(matrix), n, w);
    t = clock() - t;
    cout<<"zheev : "<<((float)t)/CLOCKS_PER_SEC<<" seconds"<<endl;
    cout<<"largest eigenvalue="<<w[n-1]<<endl;

    std::complex<double> *wc=new std::complex<double>[n];
    std::complex<double> *vl=new std::complex<double>[n*n];
    std::complex<double> *vr=new std::complex<double>[n*n];

    t = clock();
    LAPACKE_zgeev(LAPACK_COL_MAJOR,'V','V', n,reinterpret_cast< __complex__ double*>(matrix2), n, reinterpret_cast< __complex__ double*>(wc),reinterpret_cast< __complex__ double*>(vl),n,reinterpret_cast< __complex__ double*>(vr),n);
    t = clock() - t;
    cout<<"zgeev : "<<((float)t)/CLOCKS_PER_SEC<<" seconds"<<endl;
    cout<<"largest eigenvalue="<<wc[0]<<endl;

    t = clock();
    LAPACKE_zheevd(LAPACK_COL_MAJOR,'V','U', n,reinterpret_cast< __complex__ double*>(matrix3), n, w);
    t = clock() - t;
    cout<<"zheevd : "<<((float)t)/CLOCKS_PER_SEC<<" seconds"<<endl;
    cout<<"largest eigenvalue="<<w[n-1]<<endl;

    t = clock();
    LAPACKE_zheevd(LAPACK_COL_MAJOR,'N','U', n,reinterpret_cast< __complex__ double*>(matrix4), n, w);
    t = clock() - t;
    cout<<"zheevd (no vector) : "<<((float)t)/CLOCKS_PER_SEC<<" seconds"<<endl;
    cout<<"largest eigenvalue="<<w[n-1]<<endl;

    delete[] w;
    delete[] wc;
    delete[] vl;
    delete[] vr;
    delete[] matrix;
    delete[] matrix2;
    return 0;
}

The output I have of my computer is :

zheev : 2.79 seconds
largest eigenvalue=81.9995
zgeev : 10.74 seconds
largest eigenvalue=(77.8421,0)
zheevd : 0.44 seconds
largest eigenvalue=81.9995
zheevd (no vector) : 0.02 seconds
largest eigenvalue=81.9995

These tests could have been performed by using the Armadillo library. Calling directly the Lapack library might allow you to gain some memory, but wrappers of Lapack can be efficient on this aspect as well.

The real question is whether you need all the eigenvectors, all the eigenvalues or only the largest eigenvalues. Because there are really efficient methods in the last case. Take a look at the Arnoldi/Lanczos itterative algorithms. Huge memory gains are possible if the matrix is sparce since only matrix-vector products are performed : there is no need to keep the dense format. This is what is done in the SlepC library, which makes use of the sparce matrix formats of Petsc. Here is an example of Slepc which can be used as a starting point.

If someone has same problem as I in the future there are some tips after I found the solution (thank for all that posted some answers or clues!).

On the intel site you can found some nice examples which are written in Fortran and C. For example hermitian eigenvalue problem routine zheev(): https://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_lapack_examples/zheev_ex.c.htm

To make it works in C++ you should edit some lines in that code:

In prototype functions declaration do similar for all functions: extern void zheev( ... ) change to extern "C" {void zheev( ... )}

Change calling lapack's function adding _ symbol for example: zheev( ... ) to zheev_( ... ) (make it for all in code simply by substitution, but I don't why it works. I figured it out by making some experiments.)

Optionally you can convert printf function to a standard stream std::cout and replace included headers stdio.h to iostream.

To compile run command like: g++ test.cpp -o test -llapack -lgfortran -lm -Wno-write-strings

About the last option -Wno-write-strings, I don't now what it's doing, but probably there're problems with intel's example when they put string not char in calling function zheev( "Vectors", "Lower", ... )

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!