Java, escaping (using) quotes in a regex

悲&欢浪女 2020-12-18 13:12

I'm trying to use the following regex in Java, that's supposed to match any lang="2-char-lang-name":

String lang = "lang=\"" + L.detectL


        
2 answers
  • 2020-12-18 13:48

    Three backslashes would be correct (\\ + \" becomes \ + " = \"). (Update: Actually, it turns out that isn't even necessary. A single backslash also works, it seems.) The problem is your use of [..]: the [] symbols mean "any one of the characters in here", and inside them . is a literal dot rather than a wildcard, so [..] matches only a single "." character.

    Drop the [] and you should be getting what you want:

    String ab = "foo=\"bar\" lang=\"AB\"";
    String regex = "lang=\\\"..\\\"";
    String cd = ab.replaceFirst(regex, "lang=\"CD\"");
    System.out.println(cd);
    

    Output:

    foo="bar" lang="CD"
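
To check the two claims in this answer (the brackets change what the dots mean, and one backslash behaves the same as three), here is a minimal sketch. The class and method names (CharClassDemo, matches, replaceLang) are made up for illustration and are not from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CharClassDemo {
    // True if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Replaces the first match of regex in input with lang="CD".
    static String replaceLang(String input, String regex) {
        return input.replaceFirst(regex, "lang=\"CD\"");
    }

    public static void main(String[] args) {
        // Inside a character class, '.' is a literal dot, not a wildcard.
        System.out.println(matches("[..]", "."));  // true
        System.out.println(matches("[..]", "A"));  // false
        // Outside a character class, each '.' matches any one character.
        System.out.println(matches("..", "AB"));   // true

        String input = "foo=\"bar\" lang=\"AB\"";
        // One backslash: the regex engine receives  lang=".."
        String one = replaceLang(input, "lang=\"..\"");
        // Three backslashes: the regex engine receives  lang=\"..\"
        String three = replaceLang(input, "lang=\\\"..\\\"");
        System.out.println(one);                   // foo="bar" lang="CD"
        System.out.println(one.equals(three));     // true
    }
}
```

Both spellings give the same result because a backslash in front of a non-alphabetic character such as " is accepted by java.util.regex and simply matches that character.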
    
  • 2020-12-18 14:01

    Have you tried it with a single backslash? The output of

    public static void main(String[] args) {
      String inputString = "<xml lang=\"the Queen's English\">";
      System.out.println(inputString.replaceFirst("lang=\"[^\"]*\"", "lang=\"American\"" ));
    }
    

    is

    <xml lang="American">
    

    which, if I'm reading you correctly, is what you want.

    EDIT to add: the reason a single backslash works is that it's not actually part of the string, it's just part of the syntax for expressing the string. The length of the string "\"" is 1, not 2, and the method replaceFirst just sees a string containing a " (with no backslash). This is why e.g. \s (the whitespace character class in a regex) has to be written \\s in a Java string literal.
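
A small sketch of the two escaping layers described above, in case it helps to see the lengths printed out. EscapeLayersDemo and collapseWhitespace are hypothetical names used only for this illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class EscapeLayersDemo {
    // The Java literal "\\s+" is the three-character regex \s+ at runtime.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // The backslash in "\"" belongs to the literal syntax, not the string.
        System.out.println("\"".length());      // 1
        // Three backslashes in source give two characters: \ and "
        System.out.println("\\\"".length());    // 2
        // "\\s" is two characters, \ and s, which the regex engine reads as \s
        System.out.println("\\s".length());     // 2
        System.out.println(collapseWhitespace("a \t\n b")); // a b
    }
}
```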

    On the wisdom of using regex: this should be fine, if you're sure about the format of the files you're processing. If the files might include a commented-out header complete with lang spec above the real header, you could be in trouble!
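
To make that caveat concrete, here is a rough sketch of the failure mode. The sample document is invented for illustration, and CommentedHeaderDemo / setLang are hypothetical names. Matcher.quoteReplacement is there only so that a $ or backslash in the new value is not treated specially by replaceFirst.

```java
import java.util.regex.Matcher;

// Hypothetical demo class; the names and sample input are illustrative only.
class CommentedHeaderDemo {
    // Rewrites the FIRST lang="..." attribute found anywhere in the text.
    static String setLang(String doc, String lang) {
        String replacement = Matcher.quoteReplacement("lang=\"" + lang + "\"");
        return doc.replaceFirst("lang=\"[^\"]*\"", replacement);
    }

    public static void main(String[] args) {
        // A commented-out header sits above the real one.
        String doc = "<!-- <xml lang=\"old\"> -->\n<xml lang=\"en\">";
        // The regex knows nothing about comments, so it rewrites the
        // attribute inside the comment and leaves the real header alone.
        System.out.println(setLang(doc, "American"));
        // <!-- <xml lang="American"> -->
        // <xml lang="en">
    }
}
```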
