I saw this given as an answer for finding repeated words in a string. But when I use it, it treats "This" and "is" as the same word and deletes the "is":
This
is
If Unicode characters are important, then you should use this:
Pattern.compile("\\b(\\w+)(\\b\\W+\\b\\1\\b)*", Pattern.MULTILINE | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CHARACTER_CLASS)
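
For reference, here is a minimal sketch of how that pattern could be used to collapse repeated words. The class name DuplicateWords and the method removeRepeats are made up for illustration and are not from the original answer. The \b word boundaries are what keep "This" and "is" apart: a pattern without them, such as (\w+)\W+\1, can match the "is" at the end of "This" and then treat the following "is" as its repeat.

```java
import java.util.regex.Pattern;

class DuplicateWords {
    // \b(\w+) captures a whole word; (\b\W+\b\1\b)* then consumes any number of
    // immediate repeats of that same whole word, separated by non-word characters.
    // UNICODE_CHARACTER_CLASS makes \w and \b aware of non-ASCII letters.
    private static final Pattern REPEATED = Pattern.compile(
            "\\b(\\w+)(\\b\\W+\\b\\1\\b)*",
            Pattern.MULTILINE | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: replaces each run of repeated words with its first occurrence.
    static String removeRepeats(String input) {
        return REPEATED.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(removeRepeats("This is is a test"));
        System.out.println(removeRepeats("This\nis"));
    }
}
```

With this sketch, removeRepeats("This is is a test") returns "This is a test", and an input with "This" and "is" on separate lines comes back unchanged.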