Returning i-th combination of a bit array

喜欢而已 提交于 2019-12-06 06:11:19

问题


Given a bit array of fixed length and the number of 0s and 1s it contains, how can I arrange all possible combinations such that returning the i-th combinations takes the least possible time? It is not important the order in which they are returned.

Here is an example:

array length = 6
number of 0s = 4
number of 1s = 2

possible combinations (6! / 4! / 2!) 000011 000101 000110 001001 001010 001100 010001 010010 010100 011000 100001 100010 100100 101000 110000

problem 1st combination = 000011 5th combination = 001010 9th combination = 010100

With a different arrangement such as 100001 100010 100100 101000 110000 001100 010001 010010 010100 011000 000011 000101 000110 001001 001010

it shall return 1st combination = 100001 5th combination = 110000 9th combination = 010100

Currently I am using a O(n) algorithm which tests for each bit whether it is a 1 or 0. The problem is I need to handle lots of very long arrays (in the order of 10000 bits), and so it is still very slow (and caching is out of the question). I would like to know if you think a faster algorithm may exist.

Thank you


回答1:


I'm not sure I understand the problem, but if you only want the i-th combination without generating the others, here is a possible algorithm:

There are C(M,N)=M!/(N!(M-N)!) combinations of N bits set to 1 having at most highest bit at position M.

You want the i-th: you iteratively increment M until C(M,N)>=i

while( C(M,N) < i ) M = M + 1

That will tell you the highest bit that is set. Of course, you compute the combination iteratively with

C(M+1,N) = C(M,N)*(M+1)/(M+1-N)

Once found, you have a problem of finding (i-C(M-1,N))th combination of N-1 bits, so you can apply a recursion in N...
Here is a possible variant with D=C(M+1,N)-C(M,N), and I=I-1 to make it start at zero

SOL=0
I=I-1
while(N>0)
    M=N
    C=1
    D=1
    while(i>=D)
        i=i-D
        M=M+1
        D=N*C/(M-N)
        C=C+D
    SOL=SOL+(1<<(M-1))
    N=N-1
RETURN SOL

This will require large integer arithmetic if you have that many bits...




回答2:


If the ordering doesn't matter (it just needs to remain consistent), I think the fastest thing to do would be to have combination(i) return anything you want that has the desired density the first time combination() is called with argument i. Then store that value in a member variable (say, a hashmap that has the value i as key and the combination you returned as its value). The second time combination(i) is called, you just look up i in the hashmap, figure out what you returned before and return it again.

Of course, when you're returning the combination for argument(i), you'll need to make sure it's not something you have returned before for some other argument.

If the number you will ever be asked to return is significantly smaller than the total number of combinations, an easy implementation for the first call to combination(i) would be to make a value of the right length with all 0s, randomly set num_ones of the bits to 1, and then make sure it's not one you've already returned for a different value of i.




回答3:


Your problem appears to be constrained by the binomial coefficient. In the example you give, the problem can be translated as follows:

there are 6 items that can be chosen 2 at a time. By using the binomial coefficient, the total number of unique combinations can be calculated as N! / (K! (N - K)!, which for the case of K = 2 simplifies to N(N-1)/2. Plugging 6 in for N, we get 15, which is the same number of combinations that you calculated with 6! / 4! / 2! - which appears to be another way to calculate the binomial coefficient that I have never seen before. I have tried other combinations as well and both formulas generate the same number of combinations. So, it looks like your problem can be translated to a binomial coefficient problem.

Given this, it looks like you might be able to take advantage of a class that I wrote to handle common functions for working with the binomial coefficient:

  1. Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.

  2. Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.

  3. Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.

  4. Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.

  5. The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.

  6. There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.

To read about this class and download the code, see Tablizing The Binomial Coeffieicent.

It should not be hard to convert this class to the language of your choice.

There may be some limitations since you are using a very large N that could end up creating larger numbers than the program can handle. This is especially true if K can be large as well. Right now, the class is limited to the size of an int. But, it should not be hard to update it to use longs.



来源:https://stackoverflow.com/questions/12019172/returning-i-th-combination-of-a-bit-array

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!