Using identical() in R with multiple vectors

没有蜡笔的小新 2020-12-15 15:48

Suppose that I have five vectors:

A <- 1:10
B <- 1:10
C <- 1:10
D <- 1:10
E <- 1:12

I could test two at a time using identical().
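For reference, a minimal sketch of the pairwise check described above, reusing three of the vectors from the question. It only illustrates what identical() returns for one pair at a time.

```r
A <- 1:10
B <- 1:10
E <- 1:12

# identical() compares exactly two objects per call
ab_same <- identical(A, B)  # TRUE: same type, length and values
ae_same <- identical(A, E)  # FALSE: the lengths differ

print(ab_same)
print(ae_same)
```

Checking five vectors this way takes several separate calls, which is the motivation for the question.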

6 Answers

天命终不由人 (original poster)

2020-12-15 16:12

I had the same problem but decided to implement a solution based on Reduce and one based on a double for loop.

Functions:

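all_elements_the_same = function(list) {

  # comparison function used by Reduce
  comparison_func = function(x, y) {
    if (!identical(x, y)) stop() # signal an error at the first non-identical pair
    y # return the second element so it is compared with the next one
  }

  # run the comparisons; try() catches the error raised on a mismatch
  trial = try({
    Reduce(f = comparison_func, x = list, init = list[[1]])
  }, silent = TRUE)

  # an error means a mismatch was found
  # inherits() is used because class() can return more than one value
  if (inherits(trial, "try-error")) return(FALSE)
  TRUE
}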

all_elements_the_same2 = function(list, ignore_names = FALSE) {
  # double loop solution
  # note: ignore_names is accepted but not used in the body
  for (i in seq_along(list)) {
    for (j in seq_along(list)) {
      # skip if comparing to self or if comparison already done
      if (i >= j) next

      # return at the first non-identical pair
      if (!identical(list[[i]], list[[j]])) return(FALSE)
    }
  }
  TRUE
}

Test objects:

l_testlist_ok = list(1:3, 1:3, 1:3, 1:3, 1:3, 1:3)
l_testlist_bad = list(1:3, 1:3, 1:4, 1:3, 1:3, 1:3)
l_testlist_bad2 = list(1:3, 1:3, 1:4, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3)

Test functionality:

> all_elements_the_same(l_testlist_ok)
[1] TRUE
> all_elements_the_same(l_testlist_bad)
[1] FALSE
> all_elements_the_same(l_testlist_bad2)
[1] FALSE
> all_elements_the_same2(l_testlist_ok)
[1] TRUE
> all_elements_the_same2(l_testlist_bad)
[1] FALSE
> all_elements_the_same2(l_testlist_bad2)
[1] FALSE

Test time use:

> library(microbenchmark)
> microbenchmark(all_elements_the_same(l_testlist_ok),
+ all_elements_the_same(l_testlist_bad),
+ all_elements_the_same(l_testlist_bad2),
+ all_elements_the_same2(l_testlist_ok),
+ all_elements_the_same2(l_testlist_bad),
+ all_elements_the_same2(l_testlist_bad2), times = 1e4)
Unit: microseconds
                                    expr    min      lq       mean  median      uq      max neval
    all_elements_the_same(l_testlist_ok) 19.310  25.454  28.309016  26.917  28.380 1003.228 10000
   all_elements_the_same(l_testlist_bad) 93.624 100.938 108.890823 103.863 106.497 3130.807 10000
  all_elements_the_same(l_testlist_bad2) 93.331 100.938 107.963741 103.863 106.497 1181.404 10000
   all_elements_the_same2(l_testlist_ok) 48.275  53.541  57.334095  55.881  57.930  926.866 10000
  all_elements_the_same2(l_testlist_bad)  6.144   7.315   8.437603   7.900   8.778  998.839 10000
 all_elements_the_same2(l_testlist_bad2)  6.144   7.315   8.564780   8.192   8.778 1323.594 10000

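So apparently the try() part slows things down considerably: when a mismatch is found, the Reduce variant takes about 104 µs (median) against about 8 µs for the double loop. When all elements are identical, the Reduce variant is roughly twice as fast (about 27 µs against 56 µs), since it makes one comparison per element while the double loop compares every pair. It may therefore still save time to use the Reduce variant for long lists or very large objects, but for small inputs the double for loop seems the way to go.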
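If the cost of stop() and try() is indeed what hurts the Reduce variant on mismatches, one way to sidestep it is to compare every element against the first one and return early, without signalling an error. The sketch below is not part of the original answer and has not been benchmarked here. The helper name all_identical_to_first is hypothetical, and treating lists of length 0 or 1 as "all the same" is an assumption.

```r
# Hypothetical helper (not from the original answer): compare every
# element to the first one and return at the first mismatch.
all_identical_to_first <- function(lst) {
  # assumption: a list with 0 or 1 elements counts as "all the same"
  if (length(lst) < 2) return(TRUE)

  first <- lst[[1]]
  for (el in lst[-1]) {
    # if every element is identical to the first, all elements are identical to each other
    if (!identical(first, el)) return(FALSE)
  }
  TRUE
}

# vectors from the question
A <- 1:10
B <- 1:10
C <- 1:10
D <- 1:10
E <- 1:12

res_same <- all_identical_to_first(list(A, B, C, D))     # TRUE
res_diff <- all_identical_to_first(list(A, B, C, D, E))  # FALSE

print(res_same)
print(res_diff)
```

Like the Reduce variant, this makes at most one comparison per element, and like the double loop it stops at the first non-identical pair.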
