Returning i-th combination of a bit array

放肆的年华 提交于 2019-12-04 10:15:42

I'm not sure I understand the problem, but if you only want the i-th combination without generating the others, here is a possible algorithm:

There are C(M,N)=M!/(N!(M-N)!) combinations of N bits set to 1 having at most highest bit at position M.

You want the i-th: you iteratively increment M until C(M,N)>=i

while( C(M,N) < i ) M = M + 1

That will tell you the highest bit that is set. Of course, you compute the combination iteratively with

C(M+1,N) = C(M,N)*(M+1)/(M+1-N)

Once found, you have a problem of finding (i-C(M-1,N))th combination of N-1 bits, so you can apply a recursion in N...
Here is a possible variant with D=C(M+1,N)-C(M,N), and I=I-1 to make it start at zero

SOL=0
I=I-1
while(N>0)
    M=N
    C=1
    D=1
    while(i>=D)
        i=i-D
        M=M+1
        D=N*C/(M-N)
        C=C+D
    SOL=SOL+(1<<(M-1))
    N=N-1
RETURN SOL

This will require large integer arithmetic if you have that many bits...

If the ordering doesn't matter (it just needs to remain consistent), I think the fastest thing to do would be to have combination(i) return anything you want that has the desired density the first time combination() is called with argument i. Then store that value in a member variable (say, a hashmap that has the value i as key and the combination you returned as its value). The second time combination(i) is called, you just look up i in the hashmap, figure out what you returned before and return it again.

Of course, when you're returning the combination for argument(i), you'll need to make sure it's not something you have returned before for some other argument.

If the number you will ever be asked to return is significantly smaller than the total number of combinations, an easy implementation for the first call to combination(i) would be to make a value of the right length with all 0s, randomly set num_ones of the bits to 1, and then make sure it's not one you've already returned for a different value of i.

Your problem appears to be constrained by the binomial coefficient. In the example you give, the problem can be translated as follows:

there are 6 items that can be chosen 2 at a time. By using the binomial coefficient, the total number of unique combinations can be calculated as N! / (K! (N - K)!, which for the case of K = 2 simplifies to N(N-1)/2. Plugging 6 in for N, we get 15, which is the same number of combinations that you calculated with 6! / 4! / 2! - which appears to be another way to calculate the binomial coefficient that I have never seen before. I have tried other combinations as well and both formulas generate the same number of combinations. So, it looks like your problem can be translated to a binomial coefficient problem.

Given this, it looks like you might be able to take advantage of a class that I wrote to handle common functions for working with the binomial coefficient:

  1. Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.

  2. Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.

  3. Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.

  4. Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.

  5. The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.

  6. There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.

To read about this class and download the code, see Tablizing The Binomial Coeffieicent.

It should not be hard to convert this class to the language of your choice.

There may be some limitations since you are using a very large N that could end up creating larger numbers than the program can handle. This is especially true if K can be large as well. Right now, the class is limited to the size of an int. But, it should not be hard to update it to use longs.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!