Match a word using regex that also handles apostrophes

[愿得一人] 2021-01-13 18:34

I have to separate a line of text into words and am unsure which regex to use. I have looked everywhere for a regex that matches a word and found ones similar to this

2 answers
  •  春和景丽
    2021-01-13 19:21

    Using WhirlWind's answer from the page linked in my comment, you can do the following:

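    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
      public static void main(String[] args) {
        // Sample text, including stray apostrophes that should not match
        String candidate = "I \n"+
            "like \n"+
            "to "+
            "eat "+
            "but "+
            "I "+
            "don't "+
            "like "+
            "to "+
            "eat "+
            "everyone's "+
            "food "+
            "''  ''''  '.' ' "+
            "or "+
            "they'll "+
            "starv'e'";

        // Alternatives are tried left to right: 'word, wo'rd, word', word
        String regex = "('\\w+)|(\\w+'\\w+)|(\\w+')|(\\w+)";
        Matcher matcher = Pattern.compile(regex).matcher(candidate);
        // find() advances to the next match on each call
        while (matcher.find()) {
          System.out.println("> matched: `" + matcher.group() + "`");
        }
      }
    }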
    

    It will print:

    > matched: `I`
    > matched: `like`
    > matched: `to`
    > matched: `eat`
    > matched: `but`
    > matched: `I`
    > matched: `don't`
    > matched: `like`
    > matched: `to`
    > matched: `eat`
    > matched: `everyone's`
    > matched: `food`
    > matched: `or`
    > matched: `they'll`
    > matched: `starv'e`
    

    You can find a running example here: http://ideone.com/pVOmSK
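
If you need the words as a list rather than printed output, the same regex can be wrapped in a small helper. This is only a sketch: the class name `ApostropheWords` and the method `extractWords` are made up for illustration and are not part of the original answer. Keep in mind that `\w` in Java matches ASCII letters, digits and the underscore by default, so tokens such as `a_b` or `42` also count as words here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ApostropheWords {
    // Same four alternatives as the regex above, tried left to right:
    //   '\w+      leading apostrophe       ('tis)
    //   \w+'\w+   apostrophe inside a word (don't)
    //   \w+'      trailing apostrophe      (dogs')
    //   \w+       plain word               (food)
    private static final Pattern WORD =
            Pattern.compile("('\\w+)|(\\w+'\\w+)|(\\w+')|(\\w+)");

    // Hypothetical helper: returns every match in order of appearance.
    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Apostrophes that are not attached to a word character are skipped.
        System.out.println(extractWords("I don't like everyone's food or they'll starv'e'"));
    }
}
```

The pattern is compiled once into a static field so repeated calls do not recompile it. A word with both a leading and a trailing apostrophe, or with an apostrophe on each side of its last letters as in `starv'e'`, is only matched up to the part covered by one alternative, so the final apostrophe is dropped.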
