Extract numeric part of strings of mixed numbers and characters in R

傲寒 2020-11-28 07:22

I have a lot of strings, each of which has the following format: Ab_Cd-001234.txt. I want to replace each one with 001234. How can I achieve this?

4 answers
  •  情歌与酒
    2020-11-28 08:07

    The stringr package has lots of handy shortcuts for this kind of work:

    # input data following @agstudy
    data <-  c('Ab_Cd-001234.txt','Ab_Cd-001234.txt')
    
    # load library
    library(stringr)
    
    # prepare regular expression
    regexp <- "[[:digit:]]+"
    
    # process string
    str_extract(data, regexp)
    
    Which gives the desired result:
    
      [1] "001234" "001234"
    

    To explain the regexp a little:

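    [[:digit:]] matches any single digit, 0 to 9

    + means the preceding item (in this case, a digit) will be matched one or more times, so the pattern matches a run of consecutive digits; str_extract returns the first such run in each string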
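
    For comparison, the same pattern can be sketched in base R, without stringr. This is an illustration and not part of the original answer: the vector name files and its second element are invented for the example.

    ```r
    # hypothetical input; the second file name is invented for illustration
    files <- c("Ab_Cd-001234.txt", "Xy_Zw-000042.txt")

    # locate the first run of digits in each string, then pull it out
    first_run <- regmatches(files, regexpr("[[:digit:]]+", files))

    # alternative: delete every non-digit character
    # (differs from the above when a string contains several digit runs)
    digits_only <- gsub("[^[:digit:]]", "", files)

    print(first_run)
    print(digits_only)
    ```

    Both approaches return the same values for this input. One difference to keep in mind: regmatches drops strings that contain no match, whereas str_extract returns NA for them.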

    This page is also very useful for this kind of string processing: http://en.wikibooks.org/wiki/R_Programming/Text_Processing
