I do not understand why the regex pattern containing the \\d character class does not work but [0-9] does. Character classes such as \\s do work.
According to the POSIX regular expression spec:
An ordinary character is any character in the supported character set, except for the ERE special characters listed in ERE Special Characters. The interpretation of an ordinary character preceded by a backslash ( '\' ) is undefined.
So the only characters that can legally follow a \ are:
\^ \. \[ \$ \( \) \|
\* \+ \? \{ \\
all of which match the escaped character literally. Any other escape, including PCRE extensions such as \d, has undefined behavior and may not work.
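As a rough illustration of the rule quoted above, the check below tests whether a character forms a defined escape in POSIX ERE when preceded by a backslash. The function name `ere_escape_is_defined` is hypothetical and is not part of any library; it only encodes the list of special characters given above.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a backslash followed by c is a defined
   escape in POSIX ERE (c is one of the ERE special characters), else 0. */
int ere_escape_is_defined(char c)
{
    /* The special characters listed above: ^ . [ $ ( ) | * + ? { \ */
    static const char specials[] = "^.[$()|*+?{\\";

    /* strchr also finds the terminating NUL, so reject it explicitly. */
    return c != '\0' && strchr(specials, c) != NULL;
}
```

With this sketch, `ere_escape_is_defined('.')` is 1, while `ere_escape_is_defined('d')` is 0, which is why \d cannot be relied on.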
Either pattern will likely fail to match in a strictly POSIX environment. To make the pattern truly POSIX-compatible, use bracket expressions throughout:
const char *rstr = "^[[:digit:]]+[[:space:]]+[[:alpha:]]+[[:space:]]+[[:digit:]]+[[:space:]]+[[:alpha:]]+$";
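A minimal sketch of how such a pattern might be compiled and tested with the POSIX `regcomp`/`regexec` interface. The helper name `ere_matches` and the sample inputs are assumptions made for illustration; the original code that uses `rstr` is not shown here.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches pattern, 0 if it does not,
   and -1 if the pattern fails to compile. */
int ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_EXTENDED selects ERE syntax; REG_NOSUB means no submatch
       positions are reported, only match or no match. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* regexec returns 0 on a match and nonzero otherwise. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Called with the bracket-expression pattern above, an input such as "12 apples 34 pears" should match, while "twelve apples" should not.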
See: POSIX character classes
\d is a Perl and Vim character class.
Use instead:
const char *rstr = "^[[:digit:]]+\\s+\\w+\\s+[[:digit:]]+\\s+\\w+$";
The regex flavor you're using is GNU ERE, which is similar to POSIX ERE, but with a few extra features. Among these are support for the character class shorthands \s, \S, \w and \W, but not \d and \D. You can find more info here.
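Because shorthand support differs between regex implementations, one way to find out what a given C library accepts is to probe it at run time. The sketch below assumes a hypothetical helper named `ere_probe`; it reports only whether a pattern compiles as an ERE and matches a sample string, and makes no claim about any particular platform.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if pattern compiles as an ERE and matches
   sample, 0 otherwise (including when compilation fails). */
int ere_probe(const char *pattern, const char *sample)
{
    regex_t re;
    int ok;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;

    ok = (regexec(&re, sample, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

For example, comparing `ere_probe("^\\s$", " ")` with `ere_probe("^\\d$", "7")` shows which shorthands the local library honors. The results depend on the implementation, whereas the bracket-expression forms `[[:space:]]` and `[[:digit:]]` are defined by POSIX.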