Maximum XOR value faster than just using XOR

Given a number N and an array of integers (all numbers are less than 2^15; A is the size of the array, 100000),
find the maximum XOR value of N and an integer from the array.

Q is the number of queries (50000), and start, stop is the range in the array.

Input:
A Q
a1 a2 a3 ...
N start stop

Output:
Maximum XOR value of N and an integer in the array within the specified range.

E.g.: Input
15 2 (2 is the number of queries)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
10 6 10 (Query 1)
10 6 10 (Query 2)

Output:
13
13

Code:

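// no[] holds the array and a is the query number N;
// start and stop are 1-based and inclusive.
int maxxor = 0;
for (int i = start - 1; i < stop; i++) {
    int t = no[i] ^ a;
    if (maxxor < t)
        maxxor = t;
}
cout << maxxor << endl;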

I need an algorithm 10-100 times faster than this. Sorting is too expensive. I have also tried binary trees and bit manipulation.

How about a 2x-3x improvement? Is that possible through optimization?

It is possible to develop a faster algorithm.

Let's call the bits of N a[0], a[1], ..., a[15], most significant first. E.g. if N = 13 = 00000000 00001101 (in binary), then a[0] = a[1] = ... = a[11] = 0, a[12] = 1, a[13] = 1, a[14] = 0, a[15] = 1.

The main idea of the algorithm is the following: if a[0] == 1, then the best possible number to pair with N has this bit zeroed. If a[0] == 0, then the best possible number has a one at this position. So first you check whether some number has the desired bit. If yes, you keep only the numbers with this bit. If no, every number has the inverse bit, so you continue with that one. Then you process the other bits in the same manner. E.g. if a[0] == 1, a[1] == 0, you first check whether there is a number beginning with 0; if yes, you check whether there is a number beginning with 01. If nothing begins with 0, you check whether there is a number beginning with 11. And so on...
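The greedy choice described above can be sketched on a plain list of candidates, ignoring the start, stop range for now. This is only an illustration of the bit-by-bit filtering, not the fast data structure; the function name `BestPartnerByFiltering` is made up for this example, and it assumes all values are below 2^15.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the original answer): returns the element
// of `candidates` that maximizes (n ^ element) by filtering bit by bit.
// Assumes `candidates` is non-empty and all values are below 2^15.
int BestPartnerByFiltering(int n, std::vector<int> candidates) {
  assert(!candidates.empty());
  for (int bit = 14; bit >= 0; bit--) {
    // The desired bit is the opposite of n's bit at this position.
    int wanted = ((n >> bit) & 1) ^ 1;
    std::vector<int> kept;
    for (int x : candidates) {
      if (((x >> bit) & 1) == wanted) kept.push_back(x);
    }
    // Narrow the set only if some candidate has the desired bit;
    // otherwise all candidates share the inverse bit and all of them stay.
    if (!kept.empty()) candidates = kept;
  }
  // The survivors agree on all 15 bits, so they are all the same value.
  return candidates.front();
}
```

For the question's example (N = 10, candidates 6 to 10) this picks 7, and 10 ^ 7 = 13. Each step here still scans the candidates, so it is no faster than the original loop; the point of the trie is to answer "does any candidate have this prefix?" without scanning.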

So you need a fast algorithm to answer the following query: is there a number beginning with bits ... in the range start, stop?

One possibility: construct a trie from the binary representations of the numbers. In each node, store all positions in the array where a number with this prefix occurs (in sorted order). Answering the query is then a simple walk through this trie. To check whether a suitable prefix occurs in the start, stop range, do a binary search over the position array stored in the node.

This leads to an algorithm with complexity O(lg^2 N) per query, which is faster: the walk visits 15 levels, and each level needs one binary search over at most 100000 positions (about 17 comparisons).

Here is the code. It hasn't been tested much and may contain bugs:

#include <cstdio>
#include <vector>
#include <algorithm>

using namespace std;

class TrieNode {
 public:
  TrieNode* next[2];
  vector<int> positions;

  TrieNode() {
    next[0] = next[1] = NULL;
  }

  bool HasNumberInRange(int start, int stop) {
    vector<int>::iterator it = lower_bound(
        positions.begin(), positions.end(), start);
    if (it == positions.end()) return false;
    return *it < stop;
  }
};

void AddNumberToTrie(int number, int index, TrieNode* base) {
  TrieNode* cur = base;
  // Go through all binary digits from most significant
  for (int i = 14; i >= 0; i--) {
    int digit = 0;
    if ((number & (1 << i)) != 0) digit = 1;
    cur->positions.push_back(index);
    if (cur->next[digit] == NULL) {
      cur->next[digit] = new TrieNode;
    }
    cur = cur->next[digit];
  }
  cur->positions.push_back(index);
}

int FindBestNumber(int a, int start, int stop, TrieNode* base) {
  int best_num = 0;
  TrieNode* cur = base;
  for (int i = 14; i >= 0; i--) {
    int digit = 1;
    if ((a & (1 << i)) != 0) digit = 0;
    if (cur->next[digit] == NULL || 
        !cur->next[digit]->HasNumberInRange(start, stop))
      digit = 1 - digit;
    best_num *= 2;
    best_num += digit;
    cur = cur->next[digit];
  }
  return best_num;
}


int main() {
  int n; scanf("%d", &n);
  int q; scanf("%d", &q);
  TrieNode base;
  for (int i = 0; i < n; i++) {
    int x; scanf("%d", &x);
    AddNumberToTrie(x, i, &base);
  }

  for (int i = 0; i < q; i++) {
    int a, start, stop;
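    // Finds the array element at an index i with start <= i < stop whose
    // XOR with a is as large as possible. Indices are 0-based, so for the
    // question's 1-based inclusive range pass start - 1 and stop.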
    scanf("%d %d %d", &a, &start, &stop);
    printf("%d\n", FindBestNumber(a, start, stop, &base)^a);
  }
}

Your algorithm runs in linear time (O(stop - start), or O(A) for the full range of an array of size A). If you can't assume that the input array already has a special ordering, you probably won't be able to get it any faster.

You can only try to optimize the overhead within the loop, but that surely won't give you a significant increase in speed.

edit:

It seems you have to search the same list multiple times, but with different start and end indexes.

That means that pre-sorting the array is also out of the question, because that would change the order of the elements; start and end would be meaningless.

What you could try to do is avoid processing the same range twice if one query fully contains an already scanned range.

Or maybe try to consider all queries simultaneously while iterating through the array.

If you have multiple queries with the same range, you can build a tree with the numbers in that range like this:

Use a binary tree of depth 15 where the numbers are at the leaves and a number corresponds to the path that leads to it (left is 0 and right is 1).

e.g. for 0 1 4 7:

    /   \
  /      /\
/ \     /  \
0 1    4    7

Then, if your query is N = n_1 n_2 n_3 … n_15, where n_1 is the first (most significant) bit of N, n_2 the second, and so on, go from the root to a leaf. Whenever you have to make a choice, go to the right if n_i = 0 (where i is the depth of the current node), else go to the left; if the preferred child does not exist, take the other one. The leaf you reach is the number that gives the maximum XOR with N.
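As a rough sketch of this tree for one fixed range, the depth-15 binary tree can be stored implicitly in an array, heap-style, with a flag per node saying whether any number passes through it. The class name `FixedRangeXorTree` and the heap layout are assumptions made for this example; the answer above does not prescribe a storage format.

```cpp
#include <cassert>
#include <vector>

constexpr int kBits = 15;  // assumption: all values are below 2^15

// Hypothetical sketch: a complete binary tree of depth 15 stored implicitly.
// Node 1 is the root; node k has children 2k (bit 0, left) and 2k+1
// (bit 1, right). The leaf for value v is node (1 << kBits) + v.
class FixedRangeXorTree {
 public:
  // Marks every node on the root-to-leaf path of each value.
  explicit FixedRangeXorTree(const std::vector<int>& values)
      : present_(2 << kBits, false) {
    assert(!values.empty());
    for (int v : values) {
      assert(0 <= v && v < (1 << kBits));
      for (int node = (1 << kBits) + v; node >= 1; node /= 2) {
        present_[node] = true;
      }
    }
  }

  // Returns the maximum of (n ^ v) over the stored values v.
  int MaxXor(int n) const {
    int node = 1;
    for (int bit = kBits - 1; bit >= 0; bit--) {
      int wanted = ((n >> bit) & 1) ^ 1;  // right if n's bit is 0, else left
      int child = 2 * node + wanted;
      if (!present_[child]) child ^= 1;   // preferred side is empty
      node = child;
    }
    int value = node - (1 << kBits);      // the leaf index encodes the value
    return n ^ value;
  }

 private:
  std::vector<bool> present_;
};
```

For the values 0 1 4 7 from the diagram, `MaxXor(4)` walks to the leaf 1 and returns 5. Building the tree costs 16 flag writes per number, so this only pays off when many queries share the same range.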

Original Answer for one query:

Your algorithm is optimal, you need to check all numbers in the array.

There may be a way to get a slightly faster program by using programming tricks, but that has nothing to do with the algorithm.

I just came up with a solution that requires O(A log A log M) time and space for preprocessing, and O(log A log M) time for each query. M is the range of the integers, 2^15 in this problem.

For the ranges
1st..Ath number, (Tree Group 1)
1st..(A/2)th number, (A/2)th..Ath number, (Tree Group 2)
1st..(A/4)th number, (A/4)th..(A/2)th number, (A/2)th..(3A/4)th number, (3A/4)th..Ath number, (Tree Group 3)
......., (Tree Group 4)
.......,
......., (Tree Group log A)
construct a binary trie of the binary representations of all numbers in the range. There would be about 2A trees. A tree that includes x numbers has at most x log M nodes, and each number is included in only one tree in each Tree Group. So each Tree Group holds at most O(A log M) nodes, and all trees together hold O(A log A log M) nodes.

For each query, you can split the range into several ranges (no more than 2 log A) that have already been processed into trees. For each tree, we can find the maximum XOR value in O(log M) time (explained below). That is O(log A log M) time in total.

How to find the maximum in a tree? Simply prefer the 1 child if the current digit of N is 0, otherwise prefer the 0 child. If the preferred child exists, continue to that child; otherwise continue to the other.
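One way to sketch the Tree Groups is as a segment tree whose nodes each hold a binary trie. The names `BitTrie` and `RangeMaxXor`, the array-based trie layout, and the padding of the array length to a power of two are all assumptions made for this example. Queries here use a 0-based half-open range, so the question's 1-based inclusive start, stop becomes start - 1, stop.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

constexpr int kBits = 15;  // assumption: all values are below 2^15

// Hypothetical helper: a binary trie over 15-bit values.
struct BitTrie {
  std::vector<std::array<int, 2>> child;  // child[node][bit], -1 if absent

  BitTrie() : child(1, std::array<int, 2>{{-1, -1}}) {}

  void Insert(int value) {
    int node = 0;
    for (int bit = kBits - 1; bit >= 0; bit--) {
      int b = (value >> bit) & 1;
      if (child[node][b] == -1) {
        child[node][b] = static_cast<int>(child.size());
        child.push_back(std::array<int, 2>{{-1, -1}});
      }
      node = child[node][b];
    }
  }

  // Maximum of (n ^ v) over stored values v, or -1 if the trie is empty.
  int MaxXor(int n) const {
    if (child[0][0] == -1 && child[0][1] == -1) return -1;
    int node = 0, value = 0;
    for (int bit = kBits - 1; bit >= 0; bit--) {
      int b = ((n >> bit) & 1) ^ 1;      // preferred child
      if (child[node][b] == -1) b ^= 1;  // fall back to the other child
      value = value * 2 + b;
      node = child[node][b];
    }
    return n ^ value;
  }
};

// One trie per segment-tree node; node k covers the leaves below it.
class RangeMaxXor {
 public:
  explicit RangeMaxXor(const std::vector<int>& a) : size_(1) {
    while (size_ < static_cast<int>(a.size())) size_ *= 2;
    tries_.resize(2 * size_);
    for (int i = 0; i < static_cast<int>(a.size()); i++) {
      // Insert a[i] into its leaf and every ancestor (one per Tree Group).
      for (int node = size_ + i; node >= 1; node /= 2) {
        tries_[node].Insert(a[i]);
      }
    }
  }

  // Maximum of (n ^ a[i]) for start <= i < stop (0-based, half-open).
  int Query(int n, int start, int stop) const {
    int best = -1;
    for (int l = start + size_, r = stop + size_; l < r; l /= 2, r /= 2) {
      if (l & 1) best = std::max(best, tries_[l++].MaxXor(n));
      if (r & 1) best = std::max(best, tries_[--r].MaxXor(n));
    }
    return best;
  }

 private:
  int size_;
  std::vector<BitTrie> tries_;
};
```

With the question's example array 1..15, `Query(10, 5, 10)` covers the values 6 to 10 and returns 13. Each number is stored in about log A tries, so memory can become large for A = 100000; a real implementation would probably want a more compact node layout.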

Yeah, or you could just calculate it and not waste time thinking about how to do it better.

// Brute force: maximum of (x ^ y) over all pairs of integers
// x, y with l <= x <= r and l <= y <= r.
int maxXor(int l, int r) {
    int highest_xor = 0;
    for (int base = l; base <= r; base++) {
        for (int other = l; other <= r; other++) {
            int val = base ^ other;
            if (val > highest_xor) {
                highest_xor = val;
            }
        }
    }
    return highest_xor;
}

Source: https://stackoverflow.com/questions/10734334/maximum-xor-value-faster-than-just-using-xor

Tags

algorithm

xor