I have a large matrix:
set.seed(1)
a <- matrix(runif(9e+07),ncol=300)
I want to sort each row in the matrix:
> system
The package grr contains an alternate sort method that can be used to speed up this particular operation (I have reduced the matrix size somewhat so that this benchmark doesn't take forever) :
> set.seed(1)
> a <- matrix(runif(9e+06),ncol=300)
> microbenchmark::microbenchmark(sorted <- t(apply(a,1,sort))
+ ,sorted2 <- t(apply(a,1,sort.int, method='quick'))
+ ,sorted3 <- t(apply(a,1,grr::sort2)),times=3,unit='s')
Unit: seconds
expr min lq mean median uq max neval
sorted <- t(apply(a, 1, sort)) 1.7699799 1.865829 1.961853 1.961678 2.057790 2.153902 3
sorted2 <- t(apply(a, 1, sort.int, method = "quick")) 1.6162934 1.619922 1.694914 1.623551 1.734224 1.844898 3
sorted3 <- t(apply(a, 1, grr::sort2)) 0.9316073 1.003978 1.050569 1.076348 1.110049 1.143750 3
The difference becomes dramatic when the matrix contains characters:
> set.seed(1)
> a <- matrix(sample(letters,size = 9e6,replace = TRUE),ncol=300)
> microbenchmark::microbenchmark(sorted <- t(apply(a,1,sort))
+ ,sorted2 <- t(apply(a,1,sort.int, method='quick'))
+ ,sorted3 <- t(apply(a,1,grr::sort2)),times=3)
Unit: seconds
expr min lq mean median uq max neval
sorted <- t(apply(a, 1, sort)) 15.436045 15.479742 15.552009 15.523440 15.609991 15.69654 3
sorted2 <- t(apply(a, 1, sort.int, method = "quick")) 15.099618 15.340577 15.447823 15.581536 15.621925 15.66231 3
sorted3 <- t(apply(a, 1, grr::sort2)) 1.728663 1.733756 1.780737 1.738848 1.806774 1.87470 3
Results are identical for all three.
> identical(sorted,sorted2,sorted3)
[1] TRUE