Java - What is the best way to find the first duplicate character in a string?

我寻月下人不归 2020-12-19 09:02

I have written the code below to detect the first duplicate character in a string.

public static int detectDuplicate(String source) {
    boolean found = false;         


        
7 answers
  • 2020-12-19 09:33

    O(1) Algorithm

    Your solution is O(n^2) because of the two nested loops.

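    The fastest algorithm to do this is O(1) (constant time): a char has only 65,536 possible values, so by the pigeonhole principle a duplicate must show up within the first 65,537 characters, and the loop never runs longer than that no matter how long the string is: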

    public static int detectDuplicate(String source) {
        // One flag per possible char value (0..65535).
        boolean[] foundChars = new boolean[Character.MAX_VALUE + 1];
        // No explicit cap on i is needed: with only 65,536 distinct values,
        // a repeat is found no later than index 65,536.
        for (int i = 0; i < source.length(); i++) {
            char currentChar = source.charAt(i);
            if (foundChars[currentChar]) return i; // index of the repeated occurrence
            foundChars[currentChar] = true;
        }
        return -1; // no duplicate
    }
    

    However, this is only fast in big-O terms: the constant is large, because a 65,536-element array is allocated and zeroed on every call.
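The constant bound in this answer can be checked on the worst case. The following is a minimal sketch, not the answerer's code: the class name `PigeonholeBoundDemo` and the method name `firstRepeatIndex` are made up for illustration. It builds a string containing every char value exactly once, then appends one more character.

```java
class PigeonholeBoundDemo {
    // Returns the index of the first character that repeats an earlier one, or -1.
    static int firstRepeatIndex(String source) {
        boolean[] seen = new boolean[Character.MAX_VALUE + 1];
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (seen[c]) return i;
            seen[c] = true;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Worst case: all 65,536 char values, each exactly once.
        StringBuilder sb = new StringBuilder();
        for (int c = 0; c <= Character.MAX_VALUE; c++) sb.append((char) c);
        String allDistinct = sb.toString();

        System.out.println(firstRepeatIndex(allDistinct));       // -1
        // Any further character must repeat one already seen.
        System.out.println(firstRepeatIndex(allDistinct + "a")); // 65536
    }
}
```

So the loop body runs at most 65,537 times, which is where the O(1) claim comes from; for typical short strings the array allocation is the dominant cost.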

  • 2020-12-19 09:34

    As mentioned by others, your algorithm is O(n^2). Here is an O(n) algorithm: HashSet#add runs in constant time (assuming the hash function disperses the elements properly among the buckets). Note that I initially size the HashSet to its maximum possible size to avoid resizing/rehashing:

    public static int findDuplicate(String s) {
        char[] chars = s.toCharArray();
        Set<Character> uniqueChars = new HashSet<Character> (chars.length, 1);
        for (int i = 0; i < chars.length; i++) {
            if (!uniqueChars.add(chars[i])) return i;
        }
        return -1;
    }
    

    Note: this returns the index of the first duplicate (i.e. the index of the first character that is a duplicate of a previous character). To return the index of the first appearance of that character, you would need to store the indices in a Map<Character, Integer> (Map#put is also O(1) in this case):

    public static int findDuplicate(String s) {
        char[] chars = s.toCharArray();
        Map<Character, Integer> uniqueChars = new HashMap<Character, Integer> (chars.length, 1);
        for (int i = 0; i < chars.length; i++) {
            Integer previousIndex = uniqueChars.put(chars[i], i);
            if (previousIndex != null) {
                return previousIndex;
            }
        }
        return -1;
    }
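To make the difference between the two variants above concrete, here is a small self-contained sketch that runs both on the same input. The class name `DuplicateIndexDemo` and the method names `indexOfRepeat` and `indexOfFirstAppearance` are invented for this example; the bodies follow the Set and Map approaches shown above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DuplicateIndexDemo {
    // Set variant: index of the first character that repeats an earlier one.
    static int indexOfRepeat(String s) {
        Set<Character> seen = new HashSet<>(s.length(), 1);
        for (int i = 0; i < s.length(); i++) {
            if (!seen.add(s.charAt(i))) return i;
        }
        return -1;
    }

    // Map variant: index where that repeated character first appeared.
    static int indexOfFirstAppearance(String s) {
        Map<Character, Integer> firstIndex = new HashMap<>(s.length(), 1);
        for (int i = 0; i < s.length(); i++) {
            Integer previous = firstIndex.put(s.charAt(i), i);
            if (previous != null) return previous;
        }
        return -1;
    }

    public static void main(String[] args) {
        String input = "abcdbe"; // 'b' occurs at indices 1 and 4
        System.out.println(indexOfRepeat(input));          // 4
        System.out.println(indexOfFirstAppearance(input)); // 1
        System.out.println(indexOfRepeat("abc"));          // -1
    }
}
```

Both methods stop at the same position in the scan; they differ only in which of the two indices they report.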
    
  • 2020-12-19 09:35

    This is O(n**2), not O(n). Consider the case abcdefghijklmnopqrstuvwxyzz. outerIndex will range from 0 to 25 before the procedure terminates, and each time it increments, innerIndex will have ranged from outerIndex to 26.

    To get to O(n), you need to make a single pass over the list, and to do O(1) work at each position. Since the job to do at each position is to check if the character has been seen before (and if so, where), that means you need an O(1) map implementation. A hashtable gives you that; so does an array, indexed by the character code.

    assylias shows how to do it with hashing, so here's how to do it with an array (just for laughs, really):

    // Requires: import java.util.Arrays;
    public static int detectDuplicate(String source) {
        // One slot per possible char value; -1 means "not seen yet".
        int[] firstOccurrence = new int[1 << Character.SIZE];
        Arrays.fill(firstOccurrence, -1);
        for (int i = 0; i < source.length(); i++) {
            char ch = source.charAt(i);
            // Seen before: report where it first appeared.
            if (firstOccurrence[ch] != -1) return firstOccurrence[ch];
            else firstOccurrence[ch] = i;
        }
        return -1;
    }
    
  • 2020-12-19 09:42

    You could try this:

    public static char firstRecurringChar(String s) {
        char x = ' '; // returned unchanged if no character recurs
        System.out.println("STRING : " + s);
        for (int i = 0; i < s.length(); i++) {
            System.out.println("CHAR AT " + i + " = " + s.charAt(i));
            System.out.println("Last index of CHAR AT " + i + " = " + s.lastIndexOf(s.charAt(i)));
            // A later occurrence exists, so this character is duplicated.
            if (s.lastIndexOf(s.charAt(i)) > i) {
                x = s.charAt(i);
                break;
            }
        }
        return x;
    }
    
  • 2020-12-19 09:47

    I can substantially improve your algorithm. It should be done like this:

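    StringBuffer source ...
    char charLast = source.charAt( source.length()-1 );
    int xLastChar = source.length()-1;
    // Sentinel: make the last two characters equal so the loop always terminates.
    source.setCharAt( xLastChar, source.charAt( xLastChar - 1 ) );
    int i = 1;
    while( true ){
        // Compares each character with its left neighbor only.
        if( source.charAt(i) == source.charAt(i-1) ) break;
        i += 1;
    }
    source.setCharAt( xLastChar, charLast ); // restore the original last character
    // The loop stopped at the sentinel and the real last two characters differ.
    if( i == xLastChar && source.charAt( xLastChar-1 ) != charLast ) return -1;
    return i;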
    

    For a large string this algorithm is probably twice as fast as yours.
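The fragment above relies on a sentinel: the last character is temporarily overwritten so that the loop needs no separate end-of-string test. Below is a runnable sketch of that idea, under some assumptions. The class name `SentinelScanDemo` and the method name `firstAdjacentRepeat` are hypothetical, `StringBuilder` is swapped in for `StringBuffer`, and a guard for inputs shorter than two characters is added. As written, the loop only compares each character with its left neighbor, so it reports adjacent repeats, not repeats in general.

```java
class SentinelScanDemo {
    // Returns the index of the first character equal to its left neighbor, or -1.
    static int firstAdjacentRepeat(StringBuilder source) {
        int last = source.length() - 1;
        if (last < 1) return -1; // added guard: fewer than two characters

        char savedLast = source.charAt(last);
        // Sentinel: force a match at the end so the loop needs no bounds test.
        source.setCharAt(last, source.charAt(last - 1));

        int i = 1;
        while (source.charAt(i) != source.charAt(i - 1)) i++;

        source.setCharAt(last, savedLast); // restore the caller's data
        // Stopped at the sentinel, and the real last two characters differ.
        if (i == last && source.charAt(last - 1) != savedLast) return -1;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(firstAdjacentRepeat(new StringBuilder("abccd"))); // 3
        System.out.println(firstAdjacentRepeat(new StringBuilder("abcd")));  // -1
        System.out.println(firstAdjacentRepeat(new StringBuilder("abca")));  // -1, repeat is not adjacent
    }
}
```

Note that the method mutates its argument while scanning, so this sketch is not safe if another thread reads the buffer at the same time.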

  • 2020-12-19 09:48

    The complexity is roughly O(M^2), where M is the minimum of the length of the string and K, the size of the set of possible characters.

    You can get it down to O(M) time with O(K) memory by simply recording the position where you first encounter each distinct character.
