When using regex in C, \d does not work but [0-9] does

面向向阳花 2020-12-11 08:07

I do not understand why the regex pattern containing the \\d character class does not work but [0-9] does. Character classes, such as \\s

4 Answers
  • 2020-12-11 08:20

    According to the POSIX regular expression spec:

    An ordinary character is any character in the supported character set, except for the ERE special characters listed in ERE Special Characters. The interpretation of an ordinary character preceded by a backslash ( '\' ) is undefined.

    So the only characters that can legally follow a \ are:

    \^    \.    \[    \$    \(    \)    \|
    \*    \+    \?    \{    \\
    

    all of which match the escaped character literally. Any other backslash sequence, including PCRE extensions such as \d, has undefined behavior and may not work.
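As a rough illustration of the rule above, the sketch below escapes a literal string for use inside a POSIX ERE by putting a backslash only before the special characters listed. The helper names `ere_escape` and `ere_matches` are hypothetical and not part of any library; only `regcomp`, `regexec` and `regfree` from `<regex.h>` are standard.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `src` into `dst`, adding a backslash before
 * each ERE special character so the result matches `src` literally.
 * Returns 0 on success, -1 if `dst` (of size `dst_size`) is too small. */
int ere_escape(const char *src, char *dst, size_t dst_size)
{
    /* The characters that may legally follow a backslash in an ERE. */
    static const char specials[] = "^.[$()|*+?{\\";
    size_t out = 0;

    if (dst_size == 0)
        return -1;

    for (; *src != '\0'; ++src) {
        int special = strchr(specials, *src) != NULL;
        /* Room for the optional backslash, the character, and the NUL. */
        if (out + (special ? 2 : 1) + 1 > dst_size)
            return -1;
        if (special)
            dst[out++] = '\\';
        dst[out++] = *src;
    }
    dst[out] = '\0';
    return 0;
}

/* Hypothetical helper: 1 if `text` matches the ERE `pattern`,
 * 0 if it does not, -1 if the pattern fails to compile. */
int ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Calling `ere_escape("a.b", buf, sizeof buf)` should leave `a\.b` in `buf`, which then matches the text `a.b` but not `axb`.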

  • 2020-12-11 08:21

    In a strictly POSIX environment, neither pattern is likely to match anything, since the backslash shorthands are not part of POSIX. To make the pattern truly POSIX-compatible, use bracket expressions throughout:

    const char *rstr = "^[[:digit:]]+[[:space:]]+[[:alpha:]]+[[:space:]]+[[:digit:]]+[[:space:]]+[[:alpha:]]+$";
    

    Reference: POSIX character classes
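A minimal sketch of how the bracket-expression pattern could be compiled and applied with the standard `<regex.h>` API. The function name `line_matches` and the sample input mentioned below are assumptions for illustration, since the original question's input is not shown.

```c
#include <regex.h>
#include <stdio.h>

/* Pattern from the answer above: POSIX bracket expressions only. */
static const char *const RSTR =
    "^[[:digit:]]+[[:space:]]+[[:alpha:]]+[[:space:]]+"
    "[[:digit:]]+[[:space:]]+[[:alpha:]]+$";

/* Hypothetical helper: 1 if `line` matches RSTR, 0 if it does not,
 * -1 if the pattern fails to compile (the reason goes to stderr). */
int line_matches(const char *line)
{
    regex_t re;
    int rc;

    /* REG_EXTENDED selects ERE syntax, which is what gives `+` its
     * "one or more" meaning; REG_NOSUB skips capture bookkeeping. */
    rc = regcomp(&re, RSTR, REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        char msg[128];
        regerror(rc, &re, msg, sizeof msg);
        fprintf(stderr, "regcomp failed: %s\n", msg);
        return -1;
    }

    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this pattern, a line such as `123 abc 456 def` is expected to match, while `123 abc` is not, because the second number and word are missing.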

  • 2020-12-11 08:38

    \d is a Perl and Vim character class, not a POSIX one.

    Use instead:

    const char *rstr = "^[[:digit:]]+\\s+\\w+\\s+[[:digit:]]+\\s+\\w+$";
    
  • 2020-12-11 08:42

    The regex flavor you're using is GNU ERE, which is similar to POSIX ERE but adds a few extra features. Among these is support for the character class shorthands \s, \S, \w and \W, but not \d and \D.
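Because shorthand support differs between C libraries, one way to see what a given `regcomp` accepts is to probe it at run time. The sketch below is only an illustration: `ere_accepts` is a hypothetical helper, and the result for patterns such as `^\\d$` or `^\\s$` depends on the libc in use, so no particular outcome is assumed for them.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical probe: returns 1 only if `pattern` compiles as an ERE,
 * matches `yes`, and does not match `no`. Checking a negative sample
 * guards against implementations that treat an unknown escape such as
 * \d as the literal letter. */
int ere_accepts(const char *pattern, const char *yes, const char *no)
{
    regex_t re;
    int ok;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;

    ok = regexec(&re, yes, 0, NULL, 0) == 0 &&
         regexec(&re, no, 0, NULL, 0) != 0;
    regfree(&re);
    return ok;
}
```

For example, `ere_accepts("^\\d$", "7", "d")` reports whether `\d` behaves as a digit class on the current platform, whereas `ere_accepts("^[0-9]$", "7", "d")` should succeed on any conforming implementation, which is why `[0-9]` or `[[:digit:]]` remains the portable choice.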
