How to transpose a matrix in CUDA/cublas?

时光取名叫无心 2020-12-17 02:54

Say I have a matrix of dimensions A*B on the GPU, where B (the number of columns) is the leading dimension, assuming C-style (row-major) storage. Is there any method in

3 Answers
  • 2020-12-17 03:29

    The version of cuBLAS bundled with the CUDA 5 toolkit contains a BLAS-like extension routine, cublas<t>geam, that can be used to transpose a matrix. It is described in the cuBLAS library documentation under the BLAS-like extension functions.

  • 2020-12-17 03:41

    The CUDA SDK includes a matrix transpose sample. It contains example code showing how to implement a transpose, ranging from a naive implementation to optimized versions.

    For example:

    Naïve transpose

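    // Tile sizes are not given in this answer; the values below are assumed.
    // The kernel expects a thread block of TILE_DIM x BLOCK_ROWS threads.
    #define TILE_DIM   16
    #define BLOCK_ROWS 16

    __global__ void transposeNaive(float *odata, float *idata,
                                   int width, int height, int nreps)
    {
        // Column (x) and first row (y) of the input element handled by this thread.
        int xIndex = blockIdx.x*TILE_DIM + threadIdx.x;
        int yIndex = blockIdx.y*TILE_DIM + threadIdx.y;
        int index_in = xIndex + width * yIndex;    // row-major offset into idata
        int index_out = yIndex + height * xIndex;  // offset of the transposed element in odata

        // nreps only repeats the copy for timing; the result is the same for any nreps >= 1.
        for (int r=0; r < nreps; r++)
        {
            // Each thread copies TILE_DIM/BLOCK_ROWS elements of its tile column.
            for (int i=0; i<TILE_DIM; i+=BLOCK_ROWS)
            {
              odata[index_out+i] = idata[index_in+i*width];
            }
        }
    }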
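When testing a kernel like this it helps to have a host-side reference to compare against. The following is a minimal sketch in plain C++ (no CUDA required) that applies the same index_in / index_out mapping on the CPU. The name transposeReference is hypothetical and not part of the SDK sample. Note that the kernel above has no bounds checks, so width and height need to be multiples of TILE_DIM; the reference has no such restriction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, not part of the CUDA SDK sample.
// `in` holds `height` rows of `width` floats (row-major); the result holds
// `width` rows of `height` floats, i.e. the transpose.
std::vector<float> transposeReference(const std::vector<float>& in,
                                      std::size_t width, std::size_t height) {
    assert(in.size() == width * height);
    std::vector<float> out(in.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            // Same mapping as the kernel: index_in  = x + width * y,
            //                             index_out = y + height * x.
            out[y + height * x] = in[x + width * y];
        }
    }
    return out;
}
```

After copying odata back to the host, an element-wise comparison with the vector returned by transposeReference is usually enough to catch indexing mistakes.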
    

    As talonmies pointed out, cuBLAS matrix operations let you specify whether each operand should be treated as transposed. For example, cublasDgemm() computes C = a * op(A) * op(B) + b * C; if you want to use A transposed (A^T), you specify that in the corresponding parameter ('N' for normal or 'T' for transposed in the legacy API, CUBLAS_OP_N or CUBLAS_OP_T in the handle-based API). In many cases this means an explicit transpose is not needed at all.
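To make the op() flags concrete, here is a small host-side sketch in plain C++. It is not cuBLAS; it only mimics the documented formula C = a * op(A) * op(B) + b * C for column-major storage. The names Op, opAt and gemmReference are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUBLAS_OP_N / CUBLAS_OP_T.
enum class Op { N, T };

// Element (i, j) of op(M), where M is stored column-major with leading dimension ld.
inline double opAt(const std::vector<double>& M, std::size_t ld, Op op,
                   std::size_t i, std::size_t j) {
    return op == Op::N ? M[i + ld * j] : M[j + ld * i];
}

// C = alpha * op(A) * op(B) + beta * C, with op(A) of size m x k,
// op(B) of size k x n and C of size m x n, all column-major.
void gemmReference(Op opA, Op opB, std::size_t m, std::size_t n, std::size_t k,
                   double alpha, const std::vector<double>& A, std::size_t lda,
                   const std::vector<double>& B, std::size_t ldb,
                   double beta, std::vector<double>& C, std::size_t ldc) {
    assert(C.size() >= ldc * n);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < m; ++i) {
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p) {
                acc += opAt(A, lda, opA, i, p) * opAt(B, ldb, opB, p, j);
            }
            C[i + ldc * j] = alpha * acc + beta * C[i + ldc * j];
        }
    }
}
```

Passing Op::T for A reads the stored matrix with its indices swapped, so the transposed product is computed without ever materialising A^T in memory.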

  • 2020-12-17 03:43

    As asked in the title, a row-major device matrix A[m][n] can be transposed this way:

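        float* clone = ...;// copy content of A to clone (the transpose cannot be done in place)
        float const alpha(1.0);
        float const beta(0.0);
        cublasHandle_t handle;
        cublasCreate(&handle);
        // Column-major view: clone is n x m (ld n); the m x n result (ld m) is written to A,
        // which read as row-major is the n x m transpose. beta is 0, so the B operand adds nothing.
        cublasSgeam( handle, CUBLAS_OP_T, CUBLAS_OP_N, m, n, &alpha, clone, n, &beta, clone, m, A, m );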
        cublasDestroy(handle);
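The argument order in the cublasSgeam call above can look odd at first, because cuBLAS assumes column-major storage: a row-major m x n buffer is, byte for byte, a column-major n x m matrix with leading dimension n. The sketch below replays that reasoning on the host in plain C++. It is not cuBLAS; geamTransposeReference and transposeRowMajor are hypothetical names, and only the alpha * A^T case (beta = 0) is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major reference for C = alpha * A^T, where C is rows x cols
// (leading dimension ldc) and A is cols x rows (leading dimension lda).
void geamTransposeReference(std::size_t rows, std::size_t cols, float alpha,
                            const std::vector<float>& A, std::size_t lda,
                            std::vector<float>& C, std::size_t ldc) {
    assert(lda >= cols && ldc >= rows);
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t i = 0; i < rows; ++i) {
            C[i + ldc * j] = alpha * A[j + lda * i];
        }
    }
}

// Transposes a row-major m x n matrix into a row-major n x m matrix using the
// same sizes and leading dimensions as the call above: (m, n, lda = n, ldc = m).
std::vector<float> transposeRowMajor(const std::vector<float>& a,
                                     std::size_t m, std::size_t n) {
    assert(a.size() == m * n);
    std::vector<float> out(a.size());
    geamTransposeReference(m, n, 1.0f, a, n, out, m);
    return out;
}
```

For example, the 2 x 3 row-major input {1, 2, 3, 4, 5, 6} comes back as the 3 x 2 row-major matrix {1, 4, 2, 5, 3, 6}.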
    

    Similarly, to multiply two row-major matrices A[m][k] and B[k][n] so that C = A*B, swap the operands:

        cublasSgemm( handle, CUBLAS_OP_N, CUBLAS_OP_N, n, m, k, &alpha, B, n, A, k, &beta, C, n );
    

    where C is also a row-major matrix.
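The swap works because a row-major buffer read as column-major is the transpose, and C^T = B^T * A^T. Here is a host-side sketch in plain C++ of that idea, with a simple column-major multiply standing in for cublasSgemm. The names gemmColMajorNN and multiplyRowMajor are hypothetical, and alpha = 1, beta = 0 are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major C (m x n) = A (m x k) * B (k x n), no transposes.
void gemmColMajorNN(std::size_t m, std::size_t n, std::size_t k,
                    const std::vector<float>& A, std::size_t lda,
                    const std::vector<float>& B, std::size_t ldb,
                    std::vector<float>& C, std::size_t ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i + lda * p] * B[p + ldb * j];
            }
            C[i + ldc * j] = acc;
        }
    }
}

// Row-major C (m x n) = A (m x k) * B (k x n), using the same argument
// order as the cublasSgemm call above: (n, m, k, B, n, A, k, C, n).
std::vector<float> multiplyRowMajor(const std::vector<float>& A,
                                    const std::vector<float>& B,
                                    std::size_t m, std::size_t k, std::size_t n) {
    assert(A.size() == m * k && B.size() == k * n);
    std::vector<float> C(m * n, 0.0f);
    gemmColMajorNN(n, m, k, B, n, A, k, C, n);
    return C;
}
```

No data is rearranged anywhere; only the roles of A and B and the m/n sizes are exchanged.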
