I'm having some problems matching a pattern with a string of text in R.
I'm trying to get TRUE with grepl when the text is s
Although the ICU regex engine used by stringr supports bare POSIX character classes in the pattern, in the base R regex flavors (both PCRE (perl=TRUE) and TRE), POSIX character classes must be placed inside bracket expressions: [:alnum:] -> [[:alnum:]].
x <- c("AZaz09 y AZaz09", "ĄŻaz09 y AZŁł09", "26 de Marzo y Pareyra de la Luz")
grepl("[[:alnum:][:blank:]]+[[:blank:]][yY][[:blank:]][[:alnum:][:blank:]]+", x)
## => [1] TRUE TRUE TRUE
grepl("[[:alnum:][:blank:]]+[[:blank:]][yY][[:blank:]][[:alnum:][:blank:]]+", x, perl=TRUE)
## => [1] TRUE TRUE TRUE
When you use [:alnum:] alone, it is a simple bracket expression that matches a single character: one of :, a, l, n, u, or m.
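To see the difference in practice, here is a minimal sketch using the default (TRE) engine. The test strings are made up for illustration and are not from the question:

```r
# Hypothetical test strings, chosen to tell the two patterns apart
s <- c("123", "m", ":")

# Bare [:alnum:] is read as a plain bracket expression: one of : a l n u m
bare <- grepl("[:alnum:]", s)
bare
## => [1] FALSE  TRUE  TRUE

# Nested inside a bracket expression, it is the POSIX alphanumeric class
nested <- grepl("[[:alnum:]]", s)
nested
## => [1]  TRUE  TRUE FALSE
```

The digits in "123" are missed by the bare form, while the colon, which is not alphanumeric, is wrongly accepted.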
Pattern details:
[[:alnum:][:blank:]]+ - 1+ alphanumeric or horizontal whitespace symbols
[[:blank:]] - 1 horizontal whitespace symbol
[yY] - either y or Y
[[:blank:]] - 1 horizontal whitespace symbol
[[:alnum:][:blank:]]+ - 1+ alphanumeric or horizontal whitespace symbols
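
As a quick check of the breakdown above, the following sketch runs the same pattern against a few short ASCII inputs. The `tests` vector is hypothetical and only meant to exercise each part of the pattern:

```r
pat <- "[[:alnum:][:blank:]]+[[:blank:]][yY][[:blank:]][[:alnum:][:blank:]]+"

# Hypothetical inputs:
#   "Marzo y Luz" - lowercase y surrounded by blanks
#   "Marzo Y Luz" - uppercase Y surrounded by blanks
#   "Marzo yLuz"  - no blank after y
#   "y Luz"       - nothing before y
tests <- c("Marzo y Luz", "Marzo Y Luz", "Marzo yLuz", "y Luz")

res <- grepl(pat, tests)
res
## => [1]  TRUE  TRUE FALSE FALSE
```

The last two inputs fail because the y must have a blank on both sides and at least one alphanumeric or blank character before the leading blank.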