Difference between `paste`, `str_c`, `str_join`, `stri_join`, `stri_c`, `stri_paste`?

执笔经年 2020-12-10 18:28

What are the differences between all of these functions, which seem very similar?

1 answer
  • 2020-12-10 18:59
    • stri_join, stri_c, and stri_paste come from the stringi package and are pure aliases of one another.

    • str_c comes from stringr and is just stringi::stri_join with the ignore_null parameter hardcoded to TRUE, whereas stringi::stri_join defaults to ignore_null = FALSE. stringr::str_join is a deprecated alias for str_c.

    see:

    library(stringi)
    identical(stri_join, stri_c)
    # [1] TRUE
    identical(stri_join, stri_paste)
    # [1] TRUE
    
    library(stringr)
    str_c
    # function (..., sep = "", collapse = NULL) 
    # {
    #   stri_c(..., sep = sep, collapse = collapse, ignore_null = TRUE)
    # }
    # <environment: namespace:stringr>
    

    stri_join is very similar to base::paste with a few differences enumerated below:


    1. sep = "" by default

    So by default it behaves like paste0, except that paste0 has no sep argument at all.

    identical(paste0("a","b")        , stri_join("a","b"))
    # [1] TRUE
    identical(paste("a","b")         , stri_join("a","b",sep=" "))
    # [1] TRUE
    identical(paste("a","b", sep="-"), stri_join("a","b", sep="-"))
    # [1] TRUE
    

    str_c will behave just like stri_join here.


    2. Behavior with NA

    If any element being joined is NA, stri_join returns NA for that element, while paste converts NA to the string "NA".

    paste0(c("a","b"),c("c",NA))
    # [1] "ac"  "bNA"
    stri_join(c("a","b"),c("c",NA))
    # [1] "ac" NA
    

    str_c will behave just like stri_join here as well
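
    If paste-style handling of missing values is wanted from stri_join, one option is to convert NA to the string "NA" before joining. A minimal sketch, assuming stringi's stri_replace_na (whose default replacement is "NA") suits your data:

```r
library(stringi)

x <- c("a", "b")
y <- c("c", NA)

# Replace missing values with the literal string "NA" before joining,
# which mimics what paste0 does implicitly.
joined <- stri_join(x, stri_replace_na(y))
joined
# [1] "ac"  "bNA"
```

    stringr offers the equivalent str_replace_na for use with str_c.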


    3. Behavior with length 0 arguments

    When a length 0 argument is encountered, character(0) is returned, unless ignore_null is set to TRUE, in which case the argument is ignored. This differs from paste, which converts the length 0 value to "" and therefore puts 2 consecutive separators in the output.

    stri_join("a",NULL, "b")  
    # character(0)
    stri_join("a",character(0), "b")  
    # character(0)
    
    paste0("a",NULL, "b")
    # [1] "ab"
    stri_join("a",NULL, "b", ignore_null = TRUE)
    # [1] "ab"
    str_c("a",NULL, "b")
    # [1] "ab"
    
    paste("a",NULL, "b") # produces double space!
    # [1] "a  b" 
    stri_join("a",NULL, "b", ignore_null = TRUE, sep = " ")
    # [1] "a b"
    str_c("a",NULL, "b", sep = " ")
    # [1] "a b"
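
    Going the other way, base R 4.0.1 and later added a recycle0 argument to paste and paste0. As far as I know, setting it to TRUE makes a zero-length argument produce a zero-length result, much like stri_join's default. A small sketch, assuming a recent enough R version:

```r
library(stringi)

# recycle0 = TRUE (R >= 4.0.1): a zero-length argument yields a zero-length result
base_res <- paste0("a", NULL, "b", recycle0 = TRUE)

# stri_join does the same by default (ignore_null = FALSE)
stri_res <- stri_join("a", NULL, "b")

identical(base_res, stri_res)
# [1] TRUE
```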
    

    4. stri_join warns more

    paste(c("a","b"),c("c","d","e"))
    # [1] "a c" "b d" "a e"
    paste("a","b", sep = c(" ","-"))
    # [1] "a b"
    
    stri_join(c("a","b"),c("c","d","e"), sep = " ")
    # [1] "a c" "b d" "a e"
    # Warning message:
    #   In stri_join(c("a", "b"), c("c", "d", "e"), sep = " ") :
    #   longer object length is not a multiple of shorter object length
    stri_join("a","b", sep = c(" ","-"))
    # [1] "a b"
    # Warning message:
    #   In stri_join("a", "b", sep = c(" ", "-")) :
    #   argument `sep` should be one character string; taking the first one
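
    To check this difference programmatically rather than by eye, the calls can be wrapped in tryCatch. The helper name warns below is made up for illustration; it is not part of stringi or base R.

```r
library(stringi)

# Hypothetical helper: TRUE if evaluating `expr` raises a warning, else FALSE.
warns <- function(expr) {
  tryCatch({
    expr   # forces the lazily evaluated argument inside tryCatch
    FALSE
  }, warning = function(w) TRUE)
}

warns(paste(c("a", "b"), c("c", "d", "e")))
# [1] FALSE
warns(stri_join(c("a", "b"), c("c", "d", "e"), sep = " "))
# [1] TRUE
```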
    

    5. stri_join is faster (roughly 1.8x at the median in the benchmark below)

    microbenchmark::microbenchmark(
      stringi = stri_join(rep("a",1000000),rep("b",1000),"c",sep=" "),
      base    = paste(rep("a",1000000),rep("b",1000),"c")
    )
    
    # Unit: milliseconds
    #    expr       min       lq      mean    median       uq      max neval cld
    # stringi  88.54199  93.4477  97.31161  95.17157  96.8879 131.9737   100  a 
    # base    166.01024 169.7189 178.31065 171.30910 176.3055 215.5982   100   b
    