Arithmetic mean over a multidimensional array in R and MATLAB: drastic difference in performance

挽巷 2020-12-14 12:38

I am working with multidimensional arrays in both R and MATLAB. These arrays have five dimensions (about 14.5M elements in total). I have to remove a dimension applying an arith

2 Answers
  •  一整个雨季
    2020-12-14 13:00

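    mean is particularly slow here because every call goes through S3 method dispatch, and apply calls it once per cell of the result. Calling mean.default directly skips the dispatch and is faster: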

    set.seed(42)
    a = array(data = runif(144*73*6*23*10), dim = c(144,73,10,6,23))
    
    system.time({b = apply(a, c(1,2,4,5), mean.default)})
    # user  system elapsed 
    #16.80    0.03   16.94
    

    If you don't need NA handling (na.rm) or trimming, you can call the internal function directly and skip the argument checks in mean.default:

    system.time({b1 = apply(a, c(1,2,4,5),  function(x) .Internal(mean(x)))})
    # user  system elapsed 
    # 6.80    0.04    6.86
    

    For comparison:

    system.time({b2 = apply(a, c(1,2,4,5),  function(x) sum(x)/length(x))})
    # user  system elapsed 
    # 9.05    0.01    9.08 
    
    system.time({b3 = apply(a, c(1,2,4,5),  sum)
                 b3 = b3/dim(a)[[3]]})
    # user  system elapsed 
    # 7.44    0.03    7.47
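
A fully vectorised alternative avoids apply altogether. The sketch below is an addition to the answer, not something that was timed above: it moves the dimension to be averaged to the last position with aperm and lets rowMeans collapse it via its dims argument. The small array a_small is an arbitrary stand-in so that the check against apply runs quickly.

```r
set.seed(42)
# Small stand-in array (sizes chosen only for illustration).
a_small = array(runif(12*7*10*6*5), dim = c(12,7,10,6,5))

# Reference result, computed the same way as in the answer.
ref = apply(a_small, c(1,2,4,5), mean.default)

# Move dimension 3 to the end, then average over it:
# with dims = 4, rowMeans keeps the first four dimensions
# and takes the mean over everything after them.
b4 = rowMeans(aperm(a_small, c(1,2,4,5,3)), dims = 4)

# Both approaches should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(b4, ref)))
```

The same two lines should work unchanged on the full array a. Like .Internal(mean(x)), this keeps NAs unless na.rm = TRUE is passed to rowMeans.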
    

    (Note that all timings are only approximate. Proper benchmarking would require running this repeatedly, e.g., using one of the benchmarking packages. But I'm not patient enough for that right now.)
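
As a rough illustration of what repeated timing could look like, here is a minimal base-R sketch. It assumes a much smaller array than the one above so that several runs finish quickly, and the helper time_median is a hypothetical name introduced only for this example. A dedicated benchmarking package would give more careful measurements.

```r
set.seed(42)
# Smaller array than in the answer (sizes are an arbitrary choice).
a_small = array(runif(20*10*10*6*5), dim = c(20,10,10,6,5))

# Hypothetical helper: median elapsed time of f() over `times` runs.
time_median = function(f, times = 5) {
  median(replicate(times, system.time(f())[["elapsed"]]))
}

# Time three of the variants discussed above on the small array.
timings = c(
  mean         = time_median(function() apply(a_small, c(1,2,4,5), mean)),
  mean.default = time_median(function() apply(a_small, c(1,2,4,5), mean.default)),
  sum          = time_median(function() apply(a_small, c(1,2,4,5), sum) / dim(a_small)[[3]])
)
print(timings)
```

The median is used rather than the mean so that a single slow run (for example, one hit by garbage collection) does not dominate the result.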

    It might be possible to speed this up with an Rcpp implementation.
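
A minimal sketch of what such an Rcpp version might look like follows. It was not part of the timings above, and it assumes the Rcpp package and a working C++ toolchain are installed. The function name mean_mid and its parameters inner, mid and outer are hypothetical. The idea is that, in R's column-major layout, a 5-d array averaged over dimension 3 can be treated as an inner x mid x outer block, where inner = dim1*dim2 and outer = dim4*dim5.

```r
library(Rcpp)

# Hypothetical helper: mean over the middle axis of an
# inner x mid x outer array stored in column-major order.
cppFunction('
NumericVector mean_mid(NumericVector x, int inner, int mid, int outer) {
  NumericVector out(inner * outer);  // zero-initialised accumulator
  for (int o = 0; o < outer; ++o)
    for (int k = 0; k < mid; ++k)
      for (int i = 0; i < inner; ++i)
        out[i + (R_xlen_t)inner * o] +=
          x[i + (R_xlen_t)inner * (k + (R_xlen_t)mid * o)];
  for (R_xlen_t i = 0; i < out.size(); ++i) out[i] /= mid;
  return out;
}')

set.seed(42)
# Small stand-in array (sizes chosen only for illustration).
a_small = array(runif(12*7*10*6*5), dim = c(12,7,10,6,5))
d = dim(a_small)

# Collapse dims 1-2 and dims 4-5, average over dim 3, restore the shape.
b5 = mean_mid(a_small, d[1] * d[2], d[3], d[4] * d[5])
dim(b5) = d[c(1,2,4,5)]

# Check against the apply-based result from the answer.
ref = apply(a_small, c(1,2,4,5), mean.default)
stopifnot(isTRUE(all.equal(b5, ref)))
```

The loop order keeps the innermost index contiguous in memory, which is usually the cache-friendly choice for column-major data. NAs are propagated rather than removed.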
