Explain the use of a bit vector for determining if all characters are unique

野性不改 2020-12-04 04:23

I am confused about how a bit vector would work to do this (not too familiar with bit vectors). Here is the code given. Could someone please walk me through this?

         


        
12 Answers
  •  广开言路
    2020-12-04 05:01

    Simple Explanation (with JS code below)

    • At the machine level, an integer variable can be treated as an array of 32 bits.
    • All bitwise operations work on 32 bits.
    • They are independent of the OS, the CPU architecture, and the language's number representation (JS numbers, for example, are IEEE 754 double-precision floats).
    • Finding duplicates this way is like storing flags in an array of size 32: we set index 1 if we find a in the string, index 2 for b, and so on.
    • A duplicate character in the string will find its corresponding bit already occupied, that is, set to 1.
    • Ivan has already explained how this index calculation works in a previous answer.
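
    To make the "integer as a 32-slot array" idea concrete, here is a minimal sketch. The helper names `setBit` and `hasBit` are hypothetical and do not come from the original answer; they only illustrate setting and testing a single bit.

```javascript
// Hypothetical helpers (not from the original answer) that treat one
// 32-bit integer as an array of 32 boolean flags.

// Return a copy of `bits` with the bit at `position` (0..31) set to 1.
function setBit(bits, position) {
    return bits | (1 << position);
}

// Return true if the bit at `position` (0..31) is set in `bits`.
function hasBit(bits, position) {
    return (bits & (1 << position)) !== 0;
}

var flags = 0;            // all 32 flags start cleared
flags = setBit(flags, 1); // mark 'a' (97 - 96 = 1)
flags = setBit(flags, 3); // mark 'c' (99 - 96 = 3)

console.log(hasBit(flags, 1));  // true:  'a' was marked
console.log(hasBit(flags, 2));  // false: 'b' was not
console.log(flags.toString(2)); // "1010"
```

    One integer replaces a 32-element boolean array, and both the lookup and the update are a single bitwise operation.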

    Summary of operations:

    • Perform an AND operation between checker and the bit mask of the character (1 shifted left by the character's index)
    • Internally, both are 32-bit integers
    • The AND is applied bit by bit across the two
    • Check whether the result of the operation is nonzero
    • If the result is nonzero
      • The index-th bit is set in both checker and the mask
      • Thus it's a duplicate.
    • If the result is 0
      • This character hasn't been seen so far
      • Perform an OR operation between checker and the character's bit mask
      • This sets the index-th bit to 1
      • Assign the result to checker

    Assumptions:

    • We've assumed the input contains only lowercase letters (a to z)
    • And that 32 bits are enough to cover them
    • Hence we count indices from 96 as the reference point, since the ASCII code for a is 97, so a maps to bit 1
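
    These assumptions matter because JS reduces the shift count modulo 32, so a character outside a to z can land on the same bit as a lowercase letter. The sketch below is a hypothetical guarded variant (the name `checkIfUniqueLowercase` is not from the original answer) that rejects such input instead of silently giving a wrong result.

```javascript
// Hypothetical variant that enforces the lowercase-only assumption.
function checkIfUniqueLowercase(str) {
    var checker = 0;
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        // Reject anything outside 'a' (97) .. 'z' (122)
        if (code < 97 || code > 122) {
            throw new RangeError("Only 'a'..'z' supported, got: " + str[i]);
        }
        var mask = 1 << (code - 96);
        if ((checker & mask) !== 0) {
            return false; // bit already set: duplicate
        }
        checker |= mask;
    }
    return true;
}

// Why the guard is needed: 'A' is 65, so its index would be 65 - 96 = -31.
// The shift count is taken modulo 32, which turns -31 into 1, the bit used by 'a'.
var maskForUpperA = 1 << (65 - 96);
var maskForLowerA = 1 << (97 - 96);
console.log(maskForUpperA === maskForLowerA); // true: 'A' and 'a' would collide
```

    Without the guard, a string such as "aA" would be reported as containing a duplicate even though its two characters differ.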

    Given below is the JavaScript source code.

    function checkIfUniqueChars (str) {

        // Bitwise operators treat this number as a 32-bit integer;
        // bit n is set once the letter with index n has been seen.
        var checker = 0;

        for (var i = 0; i < str.length; i++) {
            // 'a' (char code 97) maps to index 1, 'z' to index 26
            var index = str.charCodeAt(i) - 96;
            var bitRepresentationOfIndex = 1 << index;

            // A nonzero result means this bit was already set: duplicate
            if ((checker & bitRepresentationOfIndex) !== 0) {
                console.log(str, false);
                return false;
            } else {
                checker = (checker | bitRepresentationOfIndex);
            }
        }
        console.log(str, true);
        return true;
    }

    checkIfUniqueChars("abcdefghi");  // true
    checkIfUniqueChars("aabcdefghi"); // false
    checkIfUniqueChars("abbcdefghi"); // false
    checkIfUniqueChars("abcdefghii"); // false
    

    Note that in JS, although numbers are stored as 64-bit floating-point values, a bitwise operation always converts its operands to 32-bit integers and produces a 32-bit integer.
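
    A few quick checks illustrate the 32-bit behaviour. This is only a sketch; the variable names are hypothetical and chosen for illustration.

```javascript
// Bit 31 is the sign bit of a 32-bit integer, so setting it yields a negative number.
var signBit = 1 << 31;
console.log(signBit); // -2147483648

// The shift count is taken modulo 32, so shifting by 32 is the same as shifting by 0.
var wrapped = 1 << 32;
console.log(wrapped); // 1

// Operands are truncated to 32 bits before the operation is applied.
var truncated = (Math.pow(2, 32) + 5) | 0;
console.log(truncated); // 5

// Because a mask can be negative, comparing the AND result against 0
// with !== is more robust than using a "greater than" comparison.
console.log((signBit & signBit) > 1);   // false
console.log((signBit & signBit) !== 0); // true
```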

    Example: If the string is aa then:

    // checker is initialized to 32-bit-Int(0)
    // therefore, checker is
    checker = 00000000000000000000000000000000
    

    i = 0

    str[0] is 'a'
    str[i].charCodeAt(0) - 96 = 1
    mask = 1 << 1 = 00000000000000000000000000000010
    
    checker 'AND' mask = 00000000000000000000000000000000
    Boolean(0) == false
    
    // So, we go for the 'OR' operation.
    
    checker = checker OR mask
    checker = 00000000000000000000000000000010
    

    i = 1

    str[1] is 'a'
    str[i].charCodeAt(0) - 96 = 1
    
    checker = 00000000000000000000000000000010
    mask    = 00000000000000000000000000000010
    
    checker 'AND' mask = 00000000000000000000000000000010
    Boolean(2) == true
    // We have our duplicate now
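
    The same walkthrough can be reproduced for any lowercase string with a small tracing sketch. The names `toBits32` and `traceUniqueChars` are hypothetical helpers written for illustration; they are not part of the original answer.

```javascript
// Hypothetical helper: render a number as a zero-padded 32-bit binary string.
function toBits32(n) {
    // >>> 0 reinterprets the value as unsigned, so negative numbers also print as 32 bits
    return (n >>> 0).toString(2).padStart(32, "0");
}

// Hypothetical tracer: returns one line per character showing the mask and
// the checker as it was before that character was processed.
function traceUniqueChars(str) {
    var checker = 0;
    var lines = [];
    for (var i = 0; i < str.length; i++) {
        var mask = 1 << (str.charCodeAt(i) - 96);
        var duplicate = (checker & mask) !== 0;
        lines.push(
            str[i] + " mask=" + toBits32(mask) +
            " checker=" + toBits32(checker) +
            (duplicate ? " duplicate" : "")
        );
        if (duplicate) {
            break; // stop at the first repeated character
        }
        checker |= mask;
    }
    return lines;
}

console.log(traceUniqueChars("aa").join("\n"));
```

    For "aa" the trace has two lines, and the second one is flagged as a duplicate because the mask and the checker share the same set bit.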
    
