How to split paragraphs into sentences?

Submitted by 空扰寡人 on 2019-11-29 02:26:07
Ruchira Gayan Ranaweera

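You can try the following. It treats a ., ! or ? as the end of a sentence only when it is followed by whitespace or the end of the input, so tokens such as Jan.13, U.S and 2.2 stay intact:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "This is how I tried to split a paragraph into a sentence. But, there is a problem. My paragraph includes dates like Jan.13, 2014 , words like U.S and numbers like 2.2. They all got split by the above code.";

// Each match is one sentence, optionally ending in a closing quote.
Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)", Pattern.MULTILINE | Pattern.COMMENTS);
Matcher reMatcher = re.matcher(str);
while (reMatcher.find()) {
    System.out.println(reMatcher.group());
}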

Output:

This is how I tried to split a paragraph into a sentence.
But, there is a problem.
My paragraph includes dates like Jan.13, 2014 , words like U.S and numbers like 2.2.
They all got split by the above code.
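
If the sentences are needed as a list rather than printed, the same pattern can be wrapped in a small helper. This is only a sketch: the class and method names (SentenceMatcher, splitSentences) are made up for illustration and do not come from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are for illustration only.
class SentenceMatcher {
    // Same regex as in the answer. Pattern.COMMENTS is left out here because the
    // pattern contains no free whitespace or '#' comments, so it has no effect.
    private static final Pattern SENTENCE = Pattern.compile(
            "[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
            Pattern.MULTILINE);

    // Collects every match of the pattern, in order, one sentence per element.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SENTENCE.matcher(text);
        while (m.find()) {
            sentences.add(m.group());
        }
        return sentences;
    }

    public static void main(String[] args) {
        String str = "This is how I tried to split a paragraph into a sentence. But, there is a problem. My paragraph includes dates like Jan.13, 2014 , words like U.S and numbers like 2.2. They all got split by the above code.";
        for (String sentence : splitSentences(str)) {
            System.out.println(sentence);
        }
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.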
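
Sathesh

String[] sentenceHolder = titleAndBodyContainer.split("(?i)(?<=[.?!])\\s+(?=[a-z])");

Try this; it worked for me. It splits on the whitespace that follows a ., ? or ! and precedes a letter, so the punctuation stays attached to its sentence.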
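
A minimal sketch of a lookaround-based split, assuming the intent is to split on the whitespace that follows sentence punctuation (hence \\s+ in the pattern). The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are for illustration only.
class LookaroundSplit {
    // Splits on whitespace that is preceded by '.', '?' or '!' and followed by
    // a letter. The lookbehind and lookahead consume nothing, so the
    // punctuation stays at the end of each sentence.
    static List<String> splitSentences(String text) {
        return Arrays.asList(text.split("(?i)(?<=[.?!])\\s+(?=[a-z])"));
    }

    public static void main(String[] args) {
        String str = "This is how I tried to split a paragraph into a sentence. But, there is a problem. My paragraph includes dates like Jan.13, 2014 , words like U.S and numbers like 2.2. They all got split by the above code.";
        for (String sentence : splitSentences(str)) {
            System.out.println(sentence);
        }
    }
}
```

Jan.13, U.S and 2.2 survive because their periods are not followed by whitespace. An abbreviation that is followed by a space and a letter, such as "Dr. Who", would still be split.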

Manas Kandekar

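This splits the paragraph at every ., ? or !:

String[] a = str.split("\\.|\\?|\\!");

Put whichever delimiter you want after \\ and separate the alternatives with |. The delimiters themselves are dropped from the result, and the split also happens inside tokens such as Jan.13, U.S and 2.2.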
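
To see what this kind of split returns, here is a small check. The class name and the sample input are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; names are for illustration only.
class NaiveSplit {
    // Splits at every '.', '?' or '!'. String.split removes the delimiters
    // and discards trailing empty strings.
    static String[] split(String text) {
        return text.split("\\.|\\?|\\!");
    }

    public static void main(String[] args) {
        // Prints [U, S and 2, 2]
        System.out.println(Arrays.toString(split("U.S and 2.2.")));
    }
}
```

Every period acts as a delimiter, including the ones inside U.S and 2.2, which is why this approach suits only text without abbreviations or decimal numbers.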
