Pattern.UNIX_LINES in regex with Java

无人共我 2021-01-21 20:20

Hi, the Java documentation says the following:

UNIX_LINES

public static final int UNIX_LINES

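Enables Unix lines mode. In this mode, only the '\n' line terminator is recognized in the behavior of ., ^, and $.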

2 answers
  •  梦谈多话
    2021-01-21 20:59

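    I will try to explain it using the dot ., since the same rule applies to ^ and $.

    Normally the dot . matches any character except a line terminator, and by default Java treats \n, \r, \r\n, \u0085, \u2028 and \u2029 as line terminators. On Unix only \n marks a new line, so in Unix lines mode other characters such as the carriage return \r are treated as normal characters.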

    Take a look at the string "A\r\nB\rC\nD". If you try to find matches for a regex like .+ using

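    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String data = "A\r\nB\rC\nD";
    System.out.println(data); // raw input (not shown in the output below)
    Matcher m = Pattern.compile(".+").matcher(data);
    // print each match in brackets so its boundaries are visible
    while (m.find()) {
        System.out.println("[" + m.group() + "]");
    }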
    

    you will get

    [A]
    [B]
    [C]
    [D]
    

    but if you add the flag Pattern.UNIX_LINES, i.e. Pattern.compile(".+", Pattern.UNIX_LINES), characters like \r also become possible matches for . and the output changes to

    [A
    ]
    [B
    C]
    [D]
    

    So the first match is [A\r], the second is [B\rC] and the third is [D].
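
The answer says the same rule applies to ^ and $ but only demonstrates the dot. The sketch below is one way to check all three on the same input. The class name UnixLinesDemo and the helper findAll are made up for illustration, and the matches are printed with \r and \n written out as escapes so they stay visible. Pattern.MULTILINE is added for the anchor cases, since ^ and $ match at line boundaries inside the input only in multiline mode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnixLinesDemo {

    // Hypothetical helper: collects every match of regex in input,
    // replacing CR and LF with the visible escapes \r and \n.
    static List<String> findAll(String regex, int flags, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            matches.add(m.group().replace("\r", "\\r").replace("\n", "\\n"));
        }
        return matches;
    }

    public static void main(String[] args) {
        String data = "A\r\nB\rC\nD";
        int multi = Pattern.MULTILINE;
        int unixMulti = Pattern.MULTILINE | Pattern.UNIX_LINES;

        // Dot: default mode vs. Unix lines mode
        System.out.println(findAll(".+", 0, data));                  // [A, B, C, D]
        System.out.println(findAll(".+", Pattern.UNIX_LINES, data)); // [A\r, B\rC, D]
        // The embedded flag (?d) switches on the same mode
        System.out.println(findAll("(?d).+", 0, data));              // [A\r, B\rC, D]

        // ^ : a lone \r no longer starts a new line, so C is not matched
        System.out.println(findAll("^\\w", multi, data));            // [A, B, C, D]
        System.out.println(findAll("^\\w", unixMulti, data));        // [A, B, D]

        // $ : only \n (or the end of input) ends a line, so A and B are not matched
        System.out.println(findAll("\\w$", multi, data));            // [A, B, C, D]
        System.out.println(findAll("\\w$", unixMulti, data));        // [C, D]
    }
}
```

In Unix lines mode the \r in "B\rC" is an ordinary character, which is why C stops counting as the start of a line and B stops counting as the end of one.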
