Removing leading zeros from alphanumeric characters in R

夕颜 2020-12-10 03:53

I have a character vector d with alphanumeric characters

d <- c("012309 template", "separate 00340", "00045", "890 098", "3405 gara


        
2 Answers
  • 2020-12-10 04:41

    You could use a negative lookbehind to remove every run of 0s that is not preceded by a digit:

    > d <- c("100001", "012309 template", "separate 00340", "00045", "890 098", "3405 garage", "matter00908")
    > gsub("(?<![0-9])0+", "", d, perl = TRUE)
    [1] "100001"         "12309 template" "separate 340"   "45"            
    [5] "890 98"         "3405 garage"    "matter908"     
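
    One caveat: a token made up only of zeros is deleted outright, so "0" becomes "". If that matters, a minimal sketch (the lookahead is an addition for illustration, not part of the pattern above) keeps the final zero by removing only zeros that are followed by another digit:

    ```r
    # Edge cases: a lone zero, an all-zero token, and interior zeros.
    x <- c("0", "000", "0 items", "100001", "matter00908")

    # Pattern from above: all-zero tokens vanish.
    plain <- gsub("(?<![0-9])0+", "", x, perl = TRUE)

    # Assumed variant: the lookahead (?=[0-9]) requires a digit after the
    # removed zeros, so the last zero of an all-zero token survives.
    keep_one <- gsub("(?<![0-9])0+(?=[0-9])", "", x, perl = TRUE)

    print(plain)
    print(keep_one)
    ```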
    

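    Another way is to match the preceding non-digit (or the start of the string) together with the zeros and put it back with \1: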

    > gsub("(^|[^0-9])0+", "\\1", d, perl = TRUE)
    [1] "100001"         "12309 template" "separate 340"   "45"            
    [5] "890 98"         "3405 garage"    "matter908"     
    
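    The capture group matters in the second pattern: the character in front of the zeros is part of the match, so it has to be restored with \1. The lookbehind version needs no such step because a lookbehind consumes nothing. A small illustration (the variable names are made up for this example):

    ```r
    d <- c("separate 00340", "890 098", "matter00908")

    # With the backreference the preceding character is restored.
    with_ref <- gsub("(^|[^0-9])0+", "\\1", d, perl = TRUE)

    # With an empty replacement the preceding character is lost as well.
    without_ref <- gsub("(^|[^0-9])0+", "", d, perl = TRUE)

    print(with_ref)
    print(without_ref)
    ```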
  • 2020-12-10 04:55

    Here's a solution using stri_replace_all_regex from the stringi package:

    d <- c("012309 template", "separate 00340", "00045",
           "890 098", "3405 garage", "matter00908")
    library("stringi")
    stri_replace_all_regex(d, "\\b0*(\\d+)\\b", "$1")
    ## [1] "12309 template" "separate 340"   "45"             "890 98"
    ## [5] "3405 garage"    "matter00908"   
    

    Explanation: We match every sequence of digits that sits between word boundaries (\b). Leading zeros are matched greedily (0*). The remaining digits (\d denotes any digit, \d+ a non-empty sequence of them) are captured in a group ((...)). Each match is then replaced with the captured group only.
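
    To see the difference between the whole match and the captured group, one can inspect a single match. This sketch uses base R's regexec and regmatches with perl = TRUE rather than stringi, on the assumption that PCRE and ICU treat this simple pattern the same way:

    ```r
    x <- c("separate 00340", "matter00908")

    # For each string: the first full match, followed by the captured group.
    m <- regmatches(x, regexec("\\b0*(\\d+)\\b", x, perl = TRUE))

    print(m[[1]])  # full match "00340", then captured group "340"
    print(m[[2]])  # empty: there is no word boundary between "r" and "0"
    ```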

    If you also wish to remove 0s within words (as in your example), just omit \b and call:

    stri_replace_all_regex(d, "0*(\\d+)", "$1")
    ## [1] "12309 template" "separate 340"   "45"             "890 98"
    ## [5] "3405 garage"    "matter908"  
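
    Both variants can be folded into one helper. The function name strip_leading_zeros and its within_words argument are hypothetical, and the sketch uses base R's gsub with perl = TRUE as a stand-in for stri_replace_all_regex:

    ```r
    # Hypothetical helper: remove leading zeros from runs of digits.
    # within_words = FALSE only touches runs delimited by word boundaries.
    strip_leading_zeros <- function(x, within_words = FALSE) {
      pattern <- if (within_words) "0*(\\d+)" else "\\b0*(\\d+)\\b"
      gsub(pattern, "\\1", x, perl = TRUE)
    }

    d <- c("012309 template", "separate 00340", "00045",
           "890 098", "3405 garage", "matter00908")

    whole_tokens <- strip_leading_zeros(d)
    inside_words <- strip_leading_zeros(d, within_words = TRUE)
    all_zero <- strip_leading_zeros("000")

    print(whole_tokens)
    print(inside_words)
    print(all_zero)
    ```

    Because \d+ must match at least one digit, an all-zero token such as "000" is reduced to "0" rather than removed.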
    