Extract characters that differ between two strings

小蘑菇 2020-12-03 08:09

I have used adist to calculate the number of characters that differ between two strings:

a <- "Happy day"
b <- "Tappy Pay"
adist(a,b)


        
6 Answers
  •  时光取名叫无心
    2020-12-03 08:48

    As long as a and b have the same length we can do this:

    s.a <- strsplit(a, "")[[1]]
    s.b <- strsplit(b, "")[[1]]
    paste(s.a[s.a != s.b], collapse = "")
    

    giving:

    [1] "Hd"
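
    When the lengths differ, the comparison above recycles the shorter vector and the result is unreliable. Here is a minimal sketch of one way to handle that, assuming characters past the end of the shorter string should count as differing. diff_chars is a hypothetical helper name, not part of the approach above:

    ```r
    # Hypothetical helper: positional diff that tolerates unequal lengths.
    diff_chars <- function(a, b) {
      s.a <- strsplit(a, "")[[1]]
      s.b <- strsplit(b, "")[[1]]
      n <- max(length(s.a), length(s.b))
      # Indexing past the end yields NA, which pads the shorter vector.
      s.a <- s.a[seq_len(n)]
      s.b <- s.b[seq_len(n)]
      # Keep a character of a when b has run out or holds a different one.
      differs <- !is.na(s.a) & (is.na(s.b) | s.a != s.b)
      paste(s.a[differs], collapse = "")
    }

    diff_chars("Happy day", "Tappy Pay")   # "Hd"
    diff_chars("Happy days", "Tappy Pay")  # "Hds"
    ```

    Only the characters of a are reported; extra trailing characters in b are ignored under this assumption.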
    

    This approach is clear to read and, in the benchmark that follows, ties for fastest among the solutions provided here, although I think I prefer f3:

    f1 <- function(a, b)
      paste(setdiff(strsplit(a,"")[[1]],strsplit(b,"")[[1]]), collapse = "")
    
    f2 <- function(a, b)
      paste(sapply(setdiff(utf8ToInt(a), utf8ToInt(b)), intToUtf8), collapse = "")
    
    f3 <- function(a, b) 
      paste(Reduce(setdiff, strsplit(c(a, b), split = "")), collapse = "")
    
    f4 <- function(a, b) {
      s.a <- strsplit(a, "")[[1]]
      s.b <- strsplit(b, "")[[1]]
      paste(s.a[s.a != s.b], collapse = "")
    }
    
    a <- "Happy day"
    b <- "Tappy Pay"
    
    library(rbenchmark)
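    # Call each function; passing bare names would only time looking up the symbol.
    benchmark(f1 = f1(a, b), f2 = f2(a, b), f3 = f3(a, b), f4 = f4(a, b),
              replications = 10000, order = "relative")[1:4]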
    

    giving the following on a fresh session on my laptop:

      test replications elapsed relative
    3   f3        10000    0.07    1.000
    4   f4        10000    0.07    1.000
    1   f1        10000    0.09    1.286
    2   f2        10000    0.10    1.429
    

    I have assumed that the differences must be in the corresponding character positions. You might want to clarify if that is the intention or not.
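
    To illustrate why this matters: the positional comparison (f4) and the set-based ones (f1 to f3) happen to agree on the example inputs, but they answer different questions. A small sketch with made-up inputs follows. positional and set_based are hypothetical names that restate f4 and f1:

    ```r
    # Positional: keep characters of a that differ from b at the same index.
    positional <- function(a, b) {
      s.a <- strsplit(a, "")[[1]]
      s.b <- strsplit(b, "")[[1]]
      paste(s.a[s.a != s.b], collapse = "")
    }

    # Set-based: keep distinct characters of a that occur nowhere in b.
    set_based <- function(a, b)
      paste(setdiff(strsplit(a, "")[[1]], strsplit(b, "")[[1]]), collapse = "")

    # Same letters in a different order: every position differs,
    # but no character of x is missing from y.
    x <- "abc"
    y <- "bca"
    positional(x, y)  # "abc"
    set_based(x, y)   # ""

    # Repeated characters: setdiff() keeps each distinct character once.
    positional("aab", "xxb")  # "aa"
    set_based("aab", "xxb")   # "a"
    ```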
