We are looking for an algorithm with the following criteria.
Input is an arbitrary positive integer (n), that represents the length of the compare subse
Before finding the FKM algorithm, I fiddled around with a simple recursive algorithm that tries every combination of 0's and 1's and returns the (lexicographically) first result. I found that this method quickly runs out of memory (at least in JavaScript in a browser), so I tried to come up with an improved non-recursive version, based on these observations:
Run through the N-length binary strings from 0 to 2^N-1. For each one, check whether it is already present in the sequence; if not, check whether it partially overlaps the end of the sequence, and append only the missing bits. This builds up the lexicographically smallest binary De Bruijn sequence in N-length chunks instead of bit by bit.
You only need to go through the N-length binary strings up to 2^(N-1)-1, and then append 2^(N-1) without overlap. The N-length strings starting with a '1' need not be checked.
You can skip the even numbers greater than 2; they are bit-shifted versions of smaller numbers that are already in the sequence. The number 2 is needed to keep 1 and 3 from overlapping incorrectly; code-wise, you can handle this by starting the sequence with 0, 1 and 2 already in place (e.g. 0000010 for N=5) and then iterating over every odd number starting at 3.
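The check-then-overlap step from the first observation can be looked at in isolation. This is only an illustrative sketch; the helper name appendWithOverlap is made up here and is not part of the implementation further down, although it follows the same logic.

```javascript
// Hypothetical helper: add one N-bit chunk to the sequence built so far.
function appendWithOverlap(seq, part) {
  // Chunk already occurs somewhere in the sequence: nothing to add.
  if (seq.indexOf(part) !== -1) return seq;
  // Look for the longest proper prefix of the chunk that matches the end of the sequence.
  for (var j = part.length - 1; j > 0; j--) {
    if (seq.slice(-j) === part.slice(0, j)) break;
  }
  // j is the overlap length (0 if nothing matched); append only the missing bits.
  return seq + part.slice(j);
}

console.log(appendWithOverlap("0000010", "00011"));               // 00000100011 (overlap of 1 bit)
console.log(appendWithOverlap("00000100011", "00101"));           // 0000010001100101 (no overlap)
console.log(appendWithOverlap("000001000110010100111", "01001")); // unchanged, 01001 is already present
```

The three calls replay the steps for 3, 5 and 9 in the N=5 case.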
Example for N=5:
0 00000
1 00001
2 00010
3 00011
4 (00100)
5 00101
6 (00110)
7 00111
8 (01000)
9 (01001)
10 (01010)
11 01011
12 (01100)
13 01101
14 (01110)
15 01111
+10000
=> 000001000110010100111010110111110000
As you can see, the sequence is built with the strings 00000 to 01111 and the appended 10000, and the strings 10001 to 11111 need not be checked. All even numbers greater than 2 can be skipped (as could the numbers 9 and 13).
This code example shows a simple implementation in JavaScript. It's fast up to N=14 or so, and will give you all 1,048,595 characters for N=20 if you have a few minutes.
    function binaryDeBruijn(n) {
        var zeros = "", max = Math.pow(2, n - 1);                  // check only up to 2^(N-1)
        for (var i = 1; i < n; i++) zeros += "0";
        var seq = zeros + (n > 2 ? "010" : "0");                   // start with 0-2 precalculated
        for (var i = 3; i < max; i += 2) {                         // odd numbers from 3
            var part = (zeros + i.toString(2)).substr(-n, n);      // binary with leading zeros
            if (seq.indexOf(part) == -1) {                         // part not already in sequence
                for (var j = n - 1; j > 0; j--) {                  // try partial match at end
                    if (seq.substr(-j, j) == part.substr(0, j)) break; // partial match found
                }
                seq += part.substr(j, n);                          // overlap with end or append
            }
        }
        return seq + "1" + zeros;                                  // append 2^(N-1)
    }
    document.write(binaryDeBruijn(10));
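To sanity-check a result, you can count the N-length windows of the output. The following is a minimal sketch, assuming the output is the linear (non-wrapping) form with 2^N + N - 1 characters as in the N=5 example; the function name isLinearDeBruijn is hypothetical.

```javascript
// Hypothetical checker: true if every N-bit string occurs exactly once in seq.
function isLinearDeBruijn(seq, n) {
  // A linear binary De Bruijn sequence has exactly 2^N windows of length N.
  if (seq.length !== Math.pow(2, n) + n - 1) return false;
  var seen = {};
  for (var i = 0; i + n <= seq.length; i++) {
    var chunk = seq.slice(i, i + n);
    // Reject non-binary characters and repeated windows.
    if (!/^[01]+$/.test(chunk) || seen[chunk]) return false;
    seen[chunk] = true;
  }
  // 2^N distinct binary windows of length N means all of them are present.
  return true;
}

console.log(isLinearDeBruijn("000001000110010100111010110111110000", 5)); // true
```

The same check can be applied to other sizes, e.g. isLinearDeBruijn(binaryDeBruijn(10), 10).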
There are other numbers besides the even numbers which could be skipped (e.g. the numbers 9 and 13 in the example); if you could predict these numbers, this would of course make the algorithm much more efficient, but I'm not sure there's an obvious pattern there.
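One candidate pattern, offered as a guess rather than something established above: 9 (01001) and 13 (01101) are both rotations of smaller numbers (5 = 00101 and 11 = 01011). The sketch below lists the odd numbers in the checked range that are not the smallest among their own rotations; the function names are made up for illustration, and it assumes N is small enough for 32-bit integer operations (N below 31).

```javascript
// Hypothetical helper: is value the smallest of all rotations of its N-bit pattern?
function isSmallestRotation(value, n) {
  var mask = Math.pow(2, n) - 1;
  var rotated = value;
  for (var k = 1; k < n; k++) {
    // Rotate left by one bit within N bits.
    rotated = ((rotated << 1) | (rotated >>> (n - 1))) & mask;
    if (rotated < value) return false;
  }
  return true;
}

// Odd numbers from 3 up to 2^(N-1) that are rotations of a smaller number.
function nonMinimalOdds(n) {
  var result = [], max = Math.pow(2, n - 1);
  for (var i = 3; i < max; i += 2) {
    if (!isSmallestRotation(i, n)) result.push(i);
  }
  return result;
}

console.log(nonMinimalOdds(5)); // [ 9, 13 ]
```

For N=5 this matches the two numbers mentioned above. Simply dropping these values from the loop in binaryDeBruijn is not guaranteed to give the same sequence, though, because the overlap step may still rely on the bits they contribute.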