How do I match all occurrences in R?

梦谈多话 2020-12-18 03:40

I have a list of 1000 names (call it A) and another list of 5 names (call it B). I want to find the row numbers at which the 5 names from B occur in the 1000-name list A.

eg. Am

2 answers
  • 2020-12-18 04:24
     A <- sample(1:10, 100, replace = TRUE) ## generate sample data 
     B <- 1:5
    
     A %in% B
    [1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE
    [13] FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
    [25] FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE
    [37] FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE
    [49]  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE
    [61]  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
    [73]  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
    [85]  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE
    [97] FALSE FALSE FALSE  TRUE
    
    which(A %in% B)
     [1]   2   3   4   5   7  11  12  14  16  17  20  21  22  23  24  27  28  30  31
    [20]  36  38  39  40  41  43  44  46  49  51  52  55  56  61  62  67  69  71  73
    [39]  74  79  85  86  87  88  91  93  94  95 100
    
    
    lapply(B, function(x) which(A %in% x)) 
    [[1]]
     [1]  5 22 23 36 40 49 69
    
    [[2]]
    [1] 21 30 39 44 46 56 61 85 93
    
    [[3]]
    [1]  2  7 14 28 38 51 62 73 87 91
    
    [[4]]
     [1]  3  4 11 12 20 24 27 41 43 52 55 71 74 79 88
    
    [[5]]
    [1]  16  17  31  67  86  94  95 100
    

    Without lapply, this output does not tell you which element of B is at which position in A.
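
    The positions from which(A %in% B) can also be paired with the values they hold, which gives a two-column lookup in base R. This is only a minimal sketch: the names, the seed, and the lookup variable are made up for illustration and are not from the question.

    ```r
    set.seed(42)  # arbitrary seed so the sketch is reproducible

    # Hypothetical data: A is the long list of names, B the short list to look up
    A <- sample(c("Amy", "Bob", "Cal", "Dee", "Eve", "Fay"), 20, replace = TRUE)
    B <- c("Amy", "Cal", "Zed")  # "Zed" never occurs in A

    # Positions in A that hold any name from B
    hits <- which(A %in% B)

    # Pair each position with the name found there
    lookup <- data.frame(name = A[hits], row = hits, stringsAsFactors = FALSE)
    lookup
    ```

    Names in B that never occur in A simply do not appear in lookup.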

  • 2020-12-18 04:30

    The grr package contains a function, matches, that finds all matches between the elements of A and B. It can return the result either as a two-column lookup table or in the list format returned by lapply(B, function(x) which(A %in% x)). In the benchmark below it is roughly 20 times faster than the lapply approach:

    > A <- as.character(sample(1:1000, 1e5, replace = TRUE)) ## generate sample data
    > B <- as.character(1:500)
    > microbenchmark::microbenchmark(
    +   result <- lapply(B, function(x) which(A %in% x)),
    +   result2 <- grr::matches(B, A, list = TRUE, all.y = FALSE),
    +   times = 10)
    Unit: milliseconds
                                                          expr        min         lq       mean     median         uq        max neval
              result <- lapply(B, function(x) which(A %in% x)) 1193.50104 1218.60509 1276.58727 1237.82048 1253.76487 1497.18798    10
     result2 <- grr::matches(B, A, list = TRUE, all.y = FALSE)   54.83836   56.28509   57.39188   57.79095   58.17673   59.46505    10
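
    If installing grr is not an option, base R's split can group all positions of A by value in one pass instead of scanning A once per element of B. This is a sketch only: it was not part of the benchmark above, so no timing is claimed for it, and the names positions and result3 are made up here.

    ```r
    set.seed(1)  # arbitrary seed, only so the sketch is reproducible
    A <- as.character(sample(1:1000, 1e5, replace = TRUE))
    B <- as.character(1:500)

    # Group every position of A by the value found there, in a single pass
    positions <- split(seq_along(A), A)

    # Pick out the groups for B, in the order of B.
    # A value of B that never occurs in A would come back as NULL, not integer(0).
    result3 <- unname(positions[B])

    # The lapply version from the first answer, for comparison
    result <- lapply(B, function(x) which(A %in% x))
    identical(result3, result)
    ```

    Whether this is fast enough for a given data set is best checked by adding it as a third expression to the microbenchmark call.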
    