Regex in Java for finding duplicate consecutive words

时光说笑 2020-12-14 19:26

I saw this as an answer for finding repeated words in a string. But when I use it, it treats "This" and "is" as the same word and deletes the "is".

6 Answers
  •  时光取名叫无心
    2020-12-14 19:55

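    You should have used \b(\w+)\b\s+\b\1\b. The \b word boundaries force both copies to be whole words, so the "is" inside "This" can no longer pair up with the following "is".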

    Hope this is what you want...
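
The symptom in the question (the "is" after "This" being deleted) is what a pattern without word boundaries would produce. The question does not show the pattern that was used, so the following is only a minimal sketch that assumes it looked like (\w+)\s+\1. The class and method names are hypothetical and chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative and not from the original post.
class WordBoundaryDemo {
    // ASSUMPTION: a pattern without word boundaries, similar to what the question describes.
    static final Pattern LOOSE = Pattern.compile("(\\w+)\\s+\\1");
    // Pattern from the answer: \b anchors both copies to whole words.
    static final Pattern STRICT = Pattern.compile("\\b(\\w+)\\b\\s+\\b\\1\\b");

    static String collapse(Pattern p, String text) {
        // "$1" replaces each matched pair with the captured word.
        return p.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        String text = "This is text text";
        // LOOSE matches "is is" inside "This is", so the standalone "is" disappears.
        System.out.println(collapse(LOOSE, text));  // prints "This text"
        // STRICT only matches the whole-word pair "text text".
        System.out.println(collapse(STRICT, text)); // prints "This is text"
    }
}
```

Without \b, the group can capture the trailing "is" of "This" and then find "is" again as the next word, which is why the second word vanishes.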

    Update 1

    Well, the output you are getting is the final string after the duplicates have been removed:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class MyDup {
        public static void main(String[] args) {
            String input = "This This is text text another another";
            String output = "";
            // \b(\w+)\b captures one whole word; \s+\b\1\b requires the same word right after it.
            Pattern p = Pattern.compile("\\b(\\w+)\\b\\s+\\b\\1\\b", Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
            Matcher m = p.matcher(input);
            if (!m.find())
                output = "No duplicates found, no changes made to data";
            else
            {
                // The find() call above consumed the first match, so rewind before looping.
                m.reset();
                while (m.find())
                {
                    // group() is the whole match ("text text"); group(1) is the single word ("text").
                    if (output.isEmpty()) {
                        output = input.replaceFirst(m.group(), m.group(1));
                    } else {
                        output = output.replaceAll(m.group(), m.group(1));
                    }
                }
                // Second pass: catches pairs left over from longer runs such as "a a a".
                input = output;
                m = p.matcher(input);
                while (m.find())
                {
                    output = output.replaceAll(m.group(), m.group(1));
                }
            }
            System.out.println("After removing duplicates, the final string is " + output);
        }
    }

    Run this code and check the output; it should answer your question.
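
The two passes in the code above can also be folded into a single replacement. This is a sketch of an alternative, not the answer's own method: it extends the answer's pattern with a repeated group, (?:\s+\1\b)+, so one match covers a whole run of the same word, and it uses Matcher.replaceAll with "$1". The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the original answer.
class DuplicateWords {
    // (?:\s+\1\b)+ lets a single match span a whole run such as "a a a".
    static final Pattern REPEATED =
            Pattern.compile("\\b(\\w+)\\b(?:\\s+\\1\\b)+", Pattern.CASE_INSENSITIVE);

    static String removeDuplicates(String text) {
        // Replace every run of a repeated word with its first occurrence.
        return REPEATED.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        // prints "This is text another"
        System.out.println(removeDuplicates("This This is text text another another"));
        // prints "a"
        System.out.println(removeDuplicates("a a a"));
    }
}
```

Because the replacement is done by the Matcher itself, the matched text never has to be fed back in as a second regex, and a string without duplicates simply comes back unchanged.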

    Note

    In output you are replacing each duplicate with a single word, aren't you?

    When I put System.out.println(m.group() + " : " + m.group(1)); inside the first if branch, it prints the matched pair followed by the captured word, in the form text text : text. In other words, each duplicate is replaced by a single word.

    else
    {
        while (m.find())
        {
            if (output.isEmpty()) {
                // Debug print: the whole match, then the captured word.
                System.out.println(m.group() + " : " + m.group(1));
                output = input.replaceFirst(m.group(), m.group(1));
            } else {
    

    I hope it is now clear what is going on. :)
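
To make the difference between m.group() and m.group(1) from the note concrete, here is a small sketch that lists every match of the answer's pattern together with its captured word. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
class GroupDemo {
    static List<String> describeMatches(String text) {
        Pattern p = Pattern.compile("\\b(\\w+)\\b\\s+\\b\\1\\b", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            // group() is the whole matched pair; group(1) is the word captured by (\w+).
            lines.add(m.group() + " : " + m.group(1));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints one line per duplicate pair:
        // This This : This
        // text text : text
        // another another : another
        for (String line : describeMatches("This This is text text another another")) {
            System.out.println(line);
        }
    }
}
```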

    Good Luck!!! Cheers!!!
