str_extract specific patterns (example)

轻奢々 2021-01-06 12:04

I'm still a little confused by regex syntax. Can you please help me with these patterns:

_A00_A1234B_
_A00_A12345B_
_A1_A12345_

my approac

4 answers
  •  予麋鹿
    予麋鹿 (original poster)
    2021-01-06 13:05

    You can try

    library(stringr)
    str_extract(str2, "[A-Z][0-9]{4,5}[A-Z]?")
    #[1] "A1234B"  "A12345B" "A12345" 
    

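    Here, the pattern looks for a capital letter ([A-Z]), followed by 4 or 5 digits ([0-9]{4,5}), optionally followed by one more capital letter ([A-Z]?). Shorter pieces such as A00 and A1 are skipped because they have fewer than 4 digits.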
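
    To see which pieces of the input the pattern accepts, here is a small sketch that runs it on the individual tokens from the question. The `tokens` vector and the `pattern` and `res` names are made up for illustration; only the regex itself comes from the answer.

    ```r
    library(stringr)

    # Same regex as above, assembled from its three parts
    pattern <- paste0("[A-Z]", "[0-9]{4,5}", "[A-Z]?")

    # Hypothetical tokens, split by hand from the question's strings
    tokens <- c("A00", "A1", "A1234B", "A12345B", "A12345")

    # Tokens with fewer than 4 digits do not match and come back as NA
    res <- str_extract(tokens, pattern)
    print(res)
    ```

    Note that str_extract returns NA for elements without a match, so the result always has the same length as the input.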

    Or you can use stringi, which may be faster on large inputs:

    library(stringi)
    stri_extract(str2, regex = "[A-Z][0-9]{4,5}[A-Z]?")
    #[1] "A1234B"  "A12345B" "A12345"
    

    Or a base R option would be

    regmatches(str2, regexpr('[A-Z][0-9]{4,5}[A-Z]?', str2))
    #[1] "A1234B"  "A12345B" "A12345"
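
    One caveat with the base R option: regmatches combined with regexpr drops elements that have no match, so the result can be shorter than the input. A minimal sketch of one way to keep the output aligned with the input follows. The vector `x`, including its non-matching middle element, is invented for illustration.

    ```r
    # Hypothetical input: the middle element contains no match
    x <- c("_A00_A1234B_", "_no_match_", "_A1_A12345_")

    # regexpr gives the match position per element, or -1 when nothing matches
    m <- regexpr("[A-Z][0-9]{4,5}[A-Z]?", x)

    # Start with NA everywhere, then fill in only the matched positions
    out <- rep(NA_character_, length(x))
    out[m != -1] <- regmatches(x, m)
    print(out)
    ```

    This mirrors what str_extract does, which returns NA for non-matching elements.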
    

    data

    str2 <- c('_A00_A1234B_', '_A00_A12345B_', '_A1_A12345_')
    
