R: Convert upper triangular part of a matrix to symmetric matrix

前端 未结 1 945
栀梦
栀梦 2020-12-20 18:27

I have the upper triangular part of matrix in R (without diagonal) and want to generate a symmetric matrix from the upper triangular part (with 1 on the diagonal but that ca

相关标签:
1条回答
  • 2020-12-20 19:23

    I think there are two issues here.

    now I want to work with very large matrices

    Then do not use R code to do this job. R will use much more memory than you expect. Try the following code:

    res.upper <- rnorm(4950)
    res <- matrix(0, 100, 100)
    tracemem(res)  ## trace memory copies of `res`
    res[upper.tri(res)] <- res.upper
    rm(res.upper)
    diag(res) <- 1
    res[lower.tri(res)]  <- t(res)[lower.tri(res)]
    

    This is what you will get:

    > res.upper <- rnorm(4950)  ## allocation of length 4950 vector
    > res <- matrix(0, 100, 100)  ## allocation of 100 * 100 matrix
    > tracemem(res)
    [1] "<0xc9e6c10>"
    > res[upper.tri(res)] <- res.upper
    tracemem[0xc9e6c10 -> 0xdb7bcf8]: ## allocation of 100 * 100 matrix
    > rm(res.upper)
    > diag(res) <- 1
    tracemem[0xdb7bcf8 -> 0xdace438]: diag<-  ## allocation of 100 * 100 matrix
    > res[lower.tri(res)]  <- t(res)[lower.tri(res)]
    tracemem[0xdace438 -> 0xdb261d0]: ## allocation of 100 * 100 matrix
    tracemem[0xdb261d0 -> 0xccc34d0]: ## allocation of 100 * 100 matrix
    

    In R, you have to use 5 * (100 * 100) + 4950 double words to finish these operations. While in C, you only need at most 4950 + 100 * 100 double words (In fact, 100 * 100 is all that is needed! Will talk about it later). It is difficult to overwrite object directly in R without extra memory assignment.

    Is there any way I can directly convert res.upper to a symmetric matrix without having to initialize the matrix res first?

    You do have to allocate memory for res because that is what you end up with; but there is no need to allocate memory for res.upper. You can initialize the upper triangular, while filling in the lower triangular at the same time. Consider the following template:

    #include <Rmath.h>  // use: double rnorm(double a, double b)
    #include <R.h>  // use: getRNGstate() and putRNGstate() for randomness
    #include <Rinternals.h>  // SEXP data type
    
    ## N is matrix dimension, a length-1 integer vector in R
    ## this function returns the matrix you want
    SEXP foo(SEXP N) {
      int i, j, n = asInteger(N);
      SEXP R_res = PROTECT(allocVector(REALSXP, n * n));  // allocate memory for `R_res`
      double *res = REAL(R_res);
      double tmp;  // a local variable for register reuse
      getRNGstate();
      for (i = 0; i < n; i++) {
        res[i * n + i] = 1.0;  // diagonal is 1, as you want
        for (j = i + 1; j < n; j++) {
          tmp = rnorm(0, 1);  
          res[j * n + i] = tmp; // initialize upper triangular
          res[i * n + j] = tmp;  // fill lower triangular
          }
        }
      putRNGstate();
      UNPROTECT(1);
      return R_res;
      }
    

    The code has not been optimized, as using integer multiplication j * n + i for addressing in the innermost loop will result in performance penalty. But I believe you can move multiplication outside the inner loop, and only leave addition inside.

    0 讨论(0)
提交回复
热议问题