Finding permutation of a set of 0 and 1, given index with O(N)

问题

I am trying to find the most efficient way to find permutation on a set of '0' and '1' given an index.

Ex: Given l = [0, 0, 1, 1]. All permutations in an ascending order is {0011, 0101, 0110, 1001, 1010, 1100}. These elements are indexed from 0 -> 5. Given index = 2, the result is 0110.

I found the algorithm here which inputs an integer multiset (e.g. l = [1, 2, 2]). His algorithm is efficient (O(N^2)). However, my multiset only consists of '0' and '1' and requires O(N) or less. N is the length of the list

Could you please kindly help me. Note that my real test is big (len(l) is 1024), so intertool library is not suitable. I am trying to speed up it as much as possible (e.g, using gmpy2...)

Based on 1, the following is my try but it is O(N^2)

from collections import Counter
from math import factorial
import gmpy2   

def permutation(l, index):
    if not index:
        return l

    counter = Counter(l)
    total_count = gmpy2.comb(len(l), counter['1'])
    acc = 0
    for i, v in enumerate(l):
        if i > 0 and v == l[i-1]:
            continue
        count = total_count * counter[v] / len(l)

        if acc + count > index:
            return [l[i]] + permutation(l[:i] + l[i + 1:], index - acc)
        acc += count

    raise ValueError("Not enough permutations")

l = ['0', '0', '1', '1']
index = 2
print (l, index)
   --> result = [0, 1, 1, 0]

Thanks in advance.

回答1:

Let's think:

For n bits with k ones there are n choose k anagrams.

For each position, p, that the i`th left-most set-bit can occupy there are 
p choose (k-i) anagrams, for example:

n = 4, k = 2, i = 1 (left-most set-bit), position 1 => 001x => 1 choose 1 = 1
n = 4, k = 2, i = 1 (left-most set-bit), position 2 => 01xx => 2 choose 1 = 2

Given index 3 (non zero-based), we calculate the position of the 
left-most set-bit:

position 1, 1 choose (2-1) = 1 anagram, index 1
position 2, 2 choose (2-1) = 2 anagrams, index 2-3

We now know the left-most set-bit must be on position 2 and we know there 
are 2 anagrams possible. 

We look at the next set-bit (i = 2):
position 0, 0 choose (2-2) = 1 anagram, index 2
position 1, 1 choose (2-2) = 1 anagram, index 3

Therefore the second set-bit is in position 1 => 0110

I think this might be O(n*k) - I hope someone can understand/explain the
complexity better and perhaps improve/optimize this algorithm idea.

回答2:

Given a permutations of N 0s and M 1s, we need to find the permutation with index K

We know that the number of permutations starting with 0 is equal to the number of permutations of N-1 0s and M 1s, let's call it K0.

if K > K0 =>  The permutation starts with 1, K remains the same
if k <= K0 => The permutation starts with 0, remove K0 from K

Fix the first bit and start again with K = K - K0 and the correct number of 0s and 1s.

This algorithm runs in O(n) where n is the number of bits (and not the length of the list).

In order to simplify the calculations, we assume a 1 based index (starting at 1)

Example:

n = xxxx
l = [0, 0, 1, 1]
K = 2 => 3
Number of permutations starting with 0: K0 = 3! / (2! * 1!) = 3
K <= K0 => first bit is a 0

n = 0xxx
l = [0, 1, 1]
K = K = 3
Number of permutations starting with 0: K0 = 2! / (2! * 0!) = 1
K > K0 => first bit is a 1

n = 01xx
l = [0, 1]
K = K - K0 = 2
Number of permutations starting with 0: K0 = 1! / (1! * 0!) = 1
K > K0 => first bit is a 1

n = 011x
l = [0]
K = K - K0 = 1
Number of permutations starting with 0: K0 = 1! / (0! * 0!) = 1
K <= K0 => first bit is a 0

n = 0110 Which is verified in your example.

Implementing this algorithm can be tricky, make sure to handle correctly the case where the whole list is composed of only 0s or 1s. Also computing the factorials might take sometime (and cause overflow in other languages), but it's possible to precompute them.

回答3:

Some ideas how you can try to attack this problem.

Here is a simple program to print all permutations:

import sys

oneBits = int(sys.argv[1])
totalLen = int(sys.argv[2])

low = 2**oneBits-1
end = 2**totalLen

print 'oneBits:',oneBits
print 'totalLen:',totalLen
print 'Range:',low,'-',end
print
format = '{0:0%db}' % totalLen
index = 0
print 'Index Pattern Value'
for i in range(low,end):
    val = format.format(i)
    if val.count('1') == oneBits:
        print '%5d %s %5d' % (index,val,i)
        index += 1

As you can see, it works purely on bit operations (well, I'm cheating a bit when counting the 1 bits :-)

When you run it with various inputs, you'll see that the input has patterns:

oneBits: 2
totalLen: 5
Range: 3 - 32

Index Pattern Value
    0 00011     3
    1 00101     5
    2 00110     6  <-- pure shift
    3 01001     9
    4 01010    10
    5 01100    12  <-- pure shift
    6 10001    17
    7 10010    18
    8 10100    20
    9 11000    24  <-- pure shift

So my first approach would be to find out the indexes where these pure shifts happen. The distances depend only on the number of 0 and 1 bits. Since the sum is always 1024, that means you should be able to pre-calculate those spots and store the results in a table with 1024 entries. This will get you close to the spot you want to be.

回答4:

Based on the idea of Samy Arous, I change his alg a bit:

if K >= K0 => The permutation starts with 1, K = K - K0
if K < K0  => The permutation starts with 0, K remains the same

The following is my code:

import gmpy2

def find_permutation (lst, K, numberbit1, numberbit0):
    l = lst
    N = numberbit0
    M = numberbit1

    if N == len(l):
        return '1' * N
    if M == len(l):
        return '1' * M

    result = ''    
    for i in range (0, len(lst)-1):
        K0 = gmpy2.comb(len(l)-1, M)
        if (K < K0):
            result += '0'
            l.remove ('0')
        else:
            result += '1'
            l.remove ('1')
            M -=1
            K = K - K0
    result += l[0]
    return result

lst = ['0','1','1', '1']
K = 1
numberbit1 = 3
numberbit0 = 1
print find_permutation (lst, K, numberbit1, numberbit0)
        --> result = '1011'

Thank you. Though it is O(n) x (the complexity of gmpy2.comb), it is better than the alg in my question.

来源：https://stackoverflow.com/questions/24752016/finding-permutation-of-a-set-of-0-and-1-given-index-with-on

标签

python

algorithm

permutation