Java Regex Escape Characters

被撕碎了的回忆 2020-12-07 04:40

I'm learning Regex, and running into trouble in the implementation.

I found the RegexTestHarness on the Java Tutorials, and running it, the following s

4 Answers
  • 2020-12-07 05:26

    \ is a special character in Java string literals ("..."). It is used to escape other special characters, or to create characters such as \n, \r and \t.
    To put a \ character into a string literal so that it reaches the regex engine, you need to escape it by adding another \ before it (just as you do in a regex when you need to escape one of its metacharacters, for example the dot as \.). So the string literal representing a single \ looks like "\\".

    This problem doesn't exist when you read data from the user, because that text never passes through the compiler as a string literal: if the user types \n in the console, it is read as two characters, \ and n.
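
    A minimal sketch of the difference between what you type in source code and what the string holds at run time. The class and method names (BackslashDemo, runtimeLength) are made up for illustration, and the char array only simulates text typed by a user.

```java
public class BackslashDemo {
    // Length of the string as it exists at run time, not as it is written in source.
    static int runtimeLength(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        String oneBackslash = "\\"; // two characters in source, one backslash at run time
        String digitClass = "\\d";  // backslash followed by d: what the regex engine receives
        String lineFeed = "\n";     // a single line-feed character

        System.out.println(runtimeLength(oneBackslash)); // 1
        System.out.println(runtimeLength(digitClass));   // 2
        System.out.println(runtimeLength(lineFeed));     // 1

        // Simulated user input: a backslash and an n arrive as two separate
        // characters, because no string-literal processing is applied to them.
        String typed = new String(new char[] {'\\', 'n'});
        System.out.println(runtimeLength(typed));        // 2
        System.out.println(typed.equals("\\n"));         // true
    }
}
```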


    Also, there is no point in adding | inside a character class [...] unless you intend the class to match the | character as well. Remember that [abc] is the same as (a|b|c), so there is no need for | in "[\\d|\\s]".
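
    To see the effect of the stray |, here is a small sketch comparing the two character classes. The class, constant and method names are hypothetical; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

public class CharClassPipeDemo {
    // Inside [...], | is an ordinary character, so this class also accepts "|".
    static final Pattern WITH_PIPE = Pattern.compile("[\\d|\\s]");
    // The intended class: a digit or a whitespace character.
    static final Pattern WITHOUT_PIPE = Pattern.compile("[\\d\\s]");

    // True when the whole input is matched by the pattern.
    static boolean matchesWhole(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole(WITH_PIPE, "|"));    // true
        System.out.println(matchesWhole(WITHOUT_PIPE, "|")); // false
        System.out.println(matchesWhole(WITH_PIPE, "7"));    // true
        System.out.println(matchesWhole(WITHOUT_PIPE, "7")); // true
        System.out.println(matchesWhole(WITHOUT_PIPE, " ")); // true
    }
}
```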

  • 2020-12-07 05:29

    If you want to represent a backslash in a Java string literal you need to escape it with another backslash, so the string literal "\\s" is two characters, \ and s. This means that to represent the regular expression [\d\s][\d]\. in a Java string literal you would use "[\\d\\s][\\d]\\.".
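
    One way to check what the regex engine actually receives is to print the string, or ask the compiled Pattern for its source. This is only a sketch; the class and helper names are invented for the example.

```java
import java.util.regex.Pattern;

public class PatternSourceDemo {
    // The regex text as the engine sees it, after the compiler has processed the literal.
    static String regexSeenByEngine(String javaString) {
        return Pattern.compile(javaString).pattern();
    }

    // True when the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String literal = "[\\d\\s][\\d]\\.";
        System.out.println(regexSeenByEngine(literal));          // [\d\s][\d]\.
        System.out.println(regexSeenByEngine(literal).length()); // 12
        System.out.println(matchesWhole(literal, " 5."));        // true
        System.out.println(matchesWhole(literal, "45."));        // true
        System.out.println(matchesWhole(literal, "45x"));        // false
    }
}
```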

    Note that I also made a slight modification to your regular expression: [\d|\s] will match a digit, whitespace, or the literal | character. You just want [\d\s]. A character class already means "match one of these"; since you don't need | for alternation within a character class, it loses its special meaning there.

  • 2020-12-07 05:31

    What is happening is that escape sequences are evaluated twice: once by the Java compiler, when it reads the string literal, and then once by the regex engine.

    The result is that you need to escape the escape character whenever you use a regex escape sequence.

    For instance, if you needed a digit, you'd use

    "\\d"
    
  • 2020-12-07 05:39

    My pattern is any double digit or single digit preceded by a space, followed by a period.

    The correct regex would be:

    Pattern pattern = Pattern.compile("(\\s\\d|\\d{2})\\.");
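
    A quick way to try this pattern on a few inputs is sketched below. The sample strings and the firstMatch helper are made up for the example; they are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DigitPeriodDemo {
    // Either a whitespace character plus one digit, or two digits, followed by a literal period.
    static final Pattern PATTERN = Pattern.compile("(\\s\\d|\\d{2})\\.");

    // Returns the first matching substring, or null when nothing matches.
    static String firstMatch(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println("[" + firstMatch("Chapter 12. Intro") + "]"); // [12.]
        System.out.println("[" + firstMatch("Item 3. foo") + "]");       // [ 3.]
        System.out.println(firstMatch("x3."));                           // null
    }
}
```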
    

    Also, if you're getting the string from user input and want it matched literally, then you should call:

    Pattern.quote(useInputRegex);
    

    This escapes all the regex special characters in that string.

    Also, you are double escaping because the first level of escaping is consumed when the Java string literal is processed, and the second one is passed on to the regex engine.
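
    As a rough illustration of what Pattern.quote changes, the sketch below matches the same text once as a regex and once literally. The class and method names and the sample text "1.5" are assumptions made for the example.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Treats userText as a regex: metacharacters such as . keep their special meaning.
    static boolean matchesAsRegex(String userText, String input) {
        return Pattern.matches(userText, input);
    }

    // Treats userText as plain text: Pattern.quote makes every character literal.
    static boolean matchesLiterally(String userText, String input) {
        return Pattern.matches(Pattern.quote(userText), input);
    }

    public static void main(String[] args) {
        String userText = "1.5"; // stands in for text typed by a user
        System.out.println(matchesAsRegex(userText, "1x5"));   // true, the dot matches any character
        System.out.println(matchesLiterally(userText, "1x5")); // false
        System.out.println(matchesLiterally(userText, "1.5")); // true
    }
}
```

    Only quote the input when it should be matched as plain text; if the user is meant to type an actual regex, pass it to Pattern.compile unchanged.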
