I have to separate a line of text into words, and am confused on what regex to use. I have looked everywhere for a regex that matches a word and found ones similar to this
Using answer from WhirlWind
on the page stated in my comment you can do the following:
String candidate = "I \n"+
"like \n"+
"to "+
"eat "+
"but "+
"I "+
"don't "+
"like "+
"to "+
"eat "+
"everyone's "+
"food "+
"'' '''' '.' ' "+
"or "+
"they'll "+
"starv'e'";
String regex = "('\\w+)|(\\w+'\\w+)|(\\w+')|(\\w+)";
Matcher matcher = Pattern.compile(regex).matcher(candidate);
while (matcher.find()) {
System.out.println("> matched: `" + matcher.group() + "`");
}
It will print:
> matched: `I`
> matched: `like`
> matched: `to`
> matched: `eat`
> matched: `but`
> matched: `I`
> matched: `don't`
> matched: `like`
> matched: `to`
> matched: `eat`
> matched: `everyone's`
> matched: `food`
> matched: `or`
> matched: `they'll`
> matched: `starv'e`
You can find a running example here: http://ideone.com/pVOmSK