I\'m still a little confused by regex syntax. Can you please help me with these patterns:
_A00_A1234B_
_A00_A12345B_
_A1_A12345_
my approac
You can try
library(stringr)
str_extract(str2, "[A-Z][0-9]{4,5}[A-Z]?")
#[1] "A1234B" "A12345B" "A12345"
Here, the pattern looks for a capital letter [A-Z], followed by 4 or 5 digits [0-9]{4,5}, followed by a capital letter [A-Z] ?
Or you can use stringi which would be faster
library(stringi)
stri_extract(str2, regex="[A-Z][0-9]{4,5}[A-Z]?")
#[1] "A1234B" "A12345B" "A12345"
Or a base R option would be
regmatches(str2,regexpr('[A-Z][0-9]{4,5}[A-Z]?', str2))
#[1] "A1234B" "A12345B" "A12345"
str2 <- c('_A00_A1234B_', '_A00_A12345B_', '_A1_A12345_')