Why is the diag function so slow? [in R 3.2.0 or earlier]

Posted by 别说谁变了你拦得住时间么 on 2019-11-30 04:12:21

Summary

As of R version 3.2.1 (World-Famous Astronaut), diag() has been updated. The discussion moved to r-devel, where it was noted that c() strips non-name attributes, which may be why it was placed there in the first place. While some people worried that removing c() would cause unknown issues on matrix-like objects, Peter Dalgaard found that, "The only case where the c() inside diag() has an effect is where M[i,j] != M[(i-1)*m+j] AND c(M) will stringize M in column-major order, so that M[i,j] == c(M)[(i-1)*m+j]."
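To make the point about c() concrete, here is a minimal sketch using a small made-up matrix (the names m, v and idx are illustrative and do not come from the r-devel thread). It shows that c() drops dim and dimnames, and that for an ordinary matrix the same linear index picks out the same elements with or without the c() call.

```r
# A small 2 x 3 integer matrix with row names (illustrative data only).
m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))

# c() returns a plain vector: dim and dimnames are gone.
v <- c(m)
print(attributes(m))  # dim and dimnames
print(attributes(v))  # NULL

# Linear positions of the diagonal in column-major storage:
# 1, 1 + (nrow + 1), 1 + 2 * (nrow + 1), ...
idx <- 1 + 0:(min(dim(m)) - 1) * (nrow(m) + 1)

# For a plain matrix, indexing before or after c() gives the same result.
print(identical(c(m)[idx], m[idx]))
```

For a plain matrix the c() call therefore only adds work, since the full matrix is flattened before a handful of elements are picked out.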

Luke Tierney tested @Frank's removal of c() and found that it did not affect anything on CRAN or BIOC, so the change was implemented: c(x)[...] was replaced with x[...] on line 27. This leads to a large speedup in diag() for big matrices. Below is a speed test showing the improvement with R 3.2.1's version of diag(); the median time drops from roughly 532 ms to roughly 0.7 ms.

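    library(microbenchmark)
    nc <- 1e4
    set.seed(1)
    # 1e4 x 1e4 character matrix (1e8 elements)
    m <- matrix(sample(letters, nc^2, replace = TRUE), ncol = nc)

    # diagOld() is assumed to be a saved copy of the pre-3.2.1 diag();
    # its definition is not given here
    microbenchmark(diagOld(m), diag(m))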
    Unit: microseconds
           expr        min          lq        mean      median         uq        max neval
     diagOld(m) 451189.242 526622.2775 545116.5668 531905.5635 540008.704 682223.733   100
        diag(m)    222.563    646.8675    644.7444    714.4575    740.701   1015.459   100
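
The benchmark calls diagOld(), whose definition is not shown. A minimal sketch of what it might look like, assuming it reproduces only the matrix-extraction step of the pre-3.2.1 diag() as described above (c(x)[...] versus x[...]), is given here. Both diagOld() and diagNew() are hypothetical reconstructions for illustration, not the actual base R source, and they omit argument checking and names handling.

```r
# Hypothetical stand-in for the pre-3.2.1 extraction step (matrix input only).
diagOld <- function(x) {
  stopifnot(is.matrix(x))
  m <- min(dim(x))
  if (m == 0L) return(vector(typeof(x), 0L))
  # c(x) flattens the whole matrix into a new vector before indexing
  c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)]
}

# Same linear indexing, applied directly to the matrix (no c() call).
diagNew <- function(x) {
  stopifnot(is.matrix(x))
  m <- min(dim(x))
  if (m == 0L) return(vector(typeof(x), 0L))
  x[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)]
}

# Sanity check on a smaller matrix than the one benchmarked above.
set.seed(1)
small <- matrix(sample(letters, 100^2, replace = TRUE), ncol = 100)
print(identical(diagOld(small), diagNew(small)))
```

With these definitions, microbenchmark(diagOld(m), diagNew(m)) should show the same kind of gap on any R version, since the only difference between the two is the c() call.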