How to calculate the index (lexicographical order) when the combination is given

萝らか妹 提交于 2019-11-27 03:41:05

Here is a snippet that will do the work.

#include <iostream>

int main()
{
    const int n = 10;
    const int k = 5;

    int combination[k] = {2, 5, 7, 8, 10};

    int index = 0;
    int j = 0;
    for (int i = 0; i != k; ++i)
    {
        for (++j; j != combination[i]; ++j)
        {
            index += c(n - j, k - i - 1);
        }
    }

    std::cout << index + 1 << std::endl;

    return 0;
}

It assumes you have a function

int c(int n, int k);

that will return the number of combinations of choosing k elements out of n elements. The loop calculates the number of combinations preceding the given combination. By adding one at the end we get the actual index.

For the given combination there are c(9, 4) = 126 combinations containing 1 and hence preceding it in lexicographic order.

Of the combinations containing 2 as the smallest number there are

c(7, 3) = 35 combinations having 3 as the second smallest number

c(6, 3) = 20 combinations having 4 as the second smallest number

All of these are preceding the given combination.

Of the combinations containing 2 and 5 as the two smallest numbers there are

c(4, 2) = 6 combinations having 6 as the third smallest number.

All of these are preceding the given combination.

Etc.

If you put a print statement in the inner loop you will get the numbers 126, 35, 20, 6, 1. Hope that explains the code.

Convert your number selections to a factorial base number. This number will be the index you want. Technically this calculates the lexicographical index of all permutations, but if you only give it combinations, the indexes will still be well ordered, just with some large gaps for all the permutations that come in between each combination.

Edit: pseudocode removed, it was incorrect, but the method above should work. Too tired to come up with correct pseudocode at the moment.

Edit 2: Here's an example. Say we were choosing a combination of 5 elements from a set of 10 elements, like in your example above. If the combination was 2 3 4 6 8, you would get the related factorial base number like so:

Take the unselected elements and count how many you have to pass by to get to the one you are selecting.

1 2 3 4 5 6 7 8 9 10
2 -> 1
1 3 4 5 6 7 8 9 10
3 -> 1
1 4 5 6 7 8 9 10
4 -> 1
1 5 6 7 8 9 10
6 -> 2
1 5 7 8 9 10
8 -> 3

So the index in factorial base is 1112300000

In decimal base, it's

1*9! + 1*8! + 1*7! + 2*6! + 3*5! = 410040

This is Algorithm 2.7 kSubsetLexRank on page 44 of Combinatorial Algorithms by Kreher and Stinson.

r = 0
t[0] = 0
for i from 1 to k
    if t[i - 1] + 1 <= t[i] - 1
        for j from t[i - 1] to t[i] - 1
            r = r + choose(n - j, k - i)
return r

The array t holds your values, for example [5 7 8 9 10]. The function choose(n, k) calculates the number "n choose k". The result value r will be the index, 251 for the example. Other inputs are n and k, for the example they would be 10 and 5.

zero-base,

# v: array of length k consisting of numbers between 0 and n-1 (ascending)
def index_of_combination(n,k,v):
    idx = 0
    for p in range(k-1):
        if p == 0: arrg = range(1,v[p]+1)
        else: arrg = range(v[p-1]+2, v[p]+1)
        for a in arrg:
            idx += combi[n-a, k-1-p]
    idx += v[k-1] - v[k-2] - 1
    return idx

Null Set has the right approach. The index corresponds to the factorial-base number of the sequence. You build a factorial-base number just like any other base number, except that the base decreases for each digit.

Now, the value of each digit in the factorial-base number is the number of elements less than it that have not yet been used. So, for combination(10, 5):

(1 2 3 4 5) == 0*9!/5! + 0*8!/5! + 0*7!/5! + 0*6!/5! + 0*5!/5!
            == 0*3024 + 0*336 + 0*42 + 0*6 + 0*1
            == 0

(10 9 8 7 6) == 9*3024 + 8*336 + 7*42 + 6*6 + 5*1
             == 30239

It should be pretty easy to calculate the index incrementally.

I needed also the same for a project of mine and the fastest solution I found was (Python):

import math

def nCr(n,r):
    f = math.factorial
    return f(n) / f(r) / f(n-r)

def index(comb,n,k):
    r=nCr(n,k)
    for i in range(k):
        if n-comb[i]<k-i:continue
        r=r-nCr(n-comb[i],k-i)
    return r

My input "comb" contained elements in increasing order You can test the code with for example:

import itertools
k=3
t=[1,2,3,4,5]
for x in itertools.combinations(t, k):
    print x,index(x,len(t),k)

It is not hard to prove that if comb=(a1,a2,a3...,ak) (in increasing order) then:

index=[nCk-(n-a1+1)Ck] + [(n-a1)C(k-1)-(n-a2+1)C(k-1)] + ... = nCk -(n-a1)Ck -(n-a2)C(k-1) - .... -(n-ak)C1

Hellius666

There's another way to do all this. You could generate all possible combinations and write them into a binary file where each comb is represented by it's index starting from zero. Then, when you need to find an index, and the combination is given, you apply a binary search on the file. Here's the function. It's written in VB.NET 2010 for my lotto program, it works with Israel lottery system so there's a bonus (7th) number; just ignore it.

Public Function Comb2Index( _
ByVal gAr() As Byte) As UInt32
 Dim mxPntr As UInt32 = WHL.AMT.WHL_SYS_00     '(16.273.488)
 Dim mdPntr As UInt32 = mxPntr \ 2
 Dim eqCntr As Byte
 Dim rdAr() As Byte

    modBinary.OpenFile(WHL.WHL_SYS_00, _
    FileMode.Open, FileAccess.Read)

    Do
    modBinary.ReadBlock(mdPntr, rdAr)
RP: If eqCntr = 7 Then GoTo EX

        If gAr(eqCntr) = rdAr(eqCntr) Then
           eqCntr += 1
           GoTo RP

        ElseIf gAr(eqCntr) < rdAr(eqCntr) Then
            If eqCntr > 0 Then eqCntr = 0
               mxPntr = mdPntr
               mdPntr \= 2

        ElseIf gAr(eqCntr) > rdAr(eqCntr) Then
            If eqCntr > 0 Then eqCntr = 0
            mdPntr += (mxPntr - mdPntr) \ 2
        End If

    Loop Until eqCntr = 7

EX: modBinary.CloseFile()
    Return mdPntr

End Function

P.S. It takes 5 to 10 mins to generate 16 million combs on a Core 2 Duo. To find the index using binary search on file takes 397 milliseconds on a SATA drive.

Assuming the maximum setSize is not too large, you can simply generate a lookup table, where the inputs are encoded this way:

  int index(a,b,c,...)
  {
      int key = 0;
      key |= 1<<a;
      key |= 1<<b;
      key |= 1<<c;
      //repeat for all arguments
      return Lookup[key];
  } 

To generate the lookup table, look at this "banker's order" algorithm. Generate all the combinations, and also store the base index for each nItems. (For the example on p6, this would be [0,1,5,11,15]). Note that by you storing the answers in the opposite order from the example (LSBs set first) you will only need one table, sized for the largest possible set.

Populate the lookup table by walking through the combinations doing Lookup[combination[i]]=i-baseIdx[nItems]

EDIT: Never mind. This is completely wrong.


Let your combination be (a1, a2, ..., ak-1, ak) where a1 < a2 < ... < ak. Let choose(a,b) = a!/(b!*(a-b)!) if a >= b and 0 otherwise. Then, the index you are looking for is

choose(ak-1, k) + choose(ak-1-1, k-1) + choose(ak-2-1, k-2) + ... + choose (a2-1, 2) + choose (a1-1, 1) + 1

The first term counts the number of k-element combinations such that the largest element is less than ak. The second term counts the number of (k-1)-element combinations such that the largest element is less than ak-1. And, so on.

Notice that the size of the universe of elements to be chosen from (10 in your example) does not play a role in the computation of the index. Can you see why?

If you have a set of positive integers 0<=x_1 < x_2< ... < x_k , then you could use something called the squashed order:

I = sum(j=1..k) Choose(x_j,j)

The beauty of the squashed order is that it works independent of the largest value in the parent set.

The squashed order is not the order you are looking for, but it is related.

To use the squashed order to get the lexicographic order in the set of k-subsets of {1,...,n) is by taking

1 <= x1 < ... < x_k <=n

compute

 0 <= n-x_k < n-x_(k-1) ... < n-x_1

Then compute the squashed order index of (n-x_k,...,n-k_1)

Then subtract the squashed order index from Choose(n,k) to get your result, which is the lexicographic index.

If you have relatively small values of n and k, you can cache all the values Choose(a,b) with a

See Anderson, Combinatorics on Finite Sets, pp 112-119

    0

    Sample solution:

    class Program
    {
        static void Main(string[] args)
        {
            // The input
            var n = 5;
            var t = new[] { 2, 4, 5 };
    
            // Helping transformations
            ComputeDistances(t);
            CorrectDistances(t);
    
            // The algorithm
            var r = CalculateRank(t, n);
    
            Console.WriteLine("n = 5");
            Console.WriteLine("t = {2, 4, 5}");
            Console.WriteLine("r = {0}", r);
    
            Console.ReadKey();
        }
    
        static void ComputeDistances(int[] t)
        {
            var k = t.Length;
            while (--k >= 0)
                t[k] -= (k + 1);
        }
    
        static void CorrectDistances(int[] t)
        {
            var k = t.Length;
            while (--k > 0)
                t[k] -= t[k - 1];
        }
    
        static int CalculateRank(int[] t, int n)
        {
            int k = t.Length - 1, r = 0;
    
            for (var i = 0; i < t.Length; i++)
            {
                if (t[i] == 0)
                {
                    n--;
                    k--;
                    continue;
                }
    
                for (var j = 0; j < t[i]; j++)
                {
                    n--;
                    r += CalculateBinomialCoefficient(n, k);
                }
    
                n--;
                k--;
            }
    
            return r;
        }
    
        static int CalculateBinomialCoefficient(int n, int k)
        {
            int i, l = 1, m, x, y;
    
            if (n - k < k)
            {
                x = k;
                y = n - k;
            }
            else
            {
                x = n - k;
                y = k;
            }
    
            for (i = x + 1; i <= n; i++)
                l *= i;
    
            m = CalculateFactorial(y);
    
            return l/m;
        }
    
        static int CalculateFactorial(int n)
        {
            int i, w = 1;
    
            for (i = 1; i <= n; i++)
                w *= i;
    
            return w;
        }
    }
    

    The idea behind the scenes is to associate a k-subset with an operation of drawing k-elements from the n-size set. It is a combination, so the overall count of possible items will be (n k). It is a clue that we could seek the solution in Pascal Triangle. After a while of comparing manually written examples with the appropriate numbers from the Pascal Triangle, we will find the pattern and hence the algorithm.

      0

      I used user515430's answer and converted to python3. Also this supports non-continuous values so you could pass in [1,3,5,7,9] as your pool instead of range(1,11)

      from itertools import combinations
      from scipy.special import comb
      from pandas import Index
      
      debugcombinations = False
      
      class IndexedCombination:
          def __init__(self, _setsize, _poolvalues):
              self.setsize = _setsize
              self.poolvals = Index(_poolvalues)
              self.poolsize = len(self.poolvals)
              self.totalcombinations = 1
              fast_k = min(self.setsize, self.poolsize - self.setsize)
              for i in range(1, fast_k + 1):
                  self.totalcombinations = self.totalcombinations * (self.poolsize - fast_k + i) // i
      
              #fill the nCr cache
              self.choose_cache = {}
              n = self.poolsize
              k = self.setsize
              for i in range(k + 1):
                  for j in range(n + 1):
                      if n - j >= k - i:
                          self.choose_cache[n - j,k - i] = comb(n - j,k - i, exact=True)
              if debugcombinations:
                  print('testnth = ' + str(self.testnth()))
      
          def get_nth_combination(self,index):
              n = self.poolsize
              r = self.setsize
              c = self.totalcombinations
              #if index < 0 or index >= c:
              #    raise IndexError
              result = []
              while r:
                  c, n, r = c*r//n, n-1, r-1
                  while index >= c:
                      index -= c
                      c, n = c*(n-r)//n, n-1
                  result.append(self.poolvals[-1 - n])
              return tuple(result)
      
          def get_n_from_combination(self,someset):
              n = self.poolsize
              k = self.setsize
              index = 0
              j = 0
              for i in range(k):
                  setidx = self.poolvals.get_loc(someset[i])
                  for j in range(j + 1, setidx + 1):
                      index += self.choose_cache[n - j, k - i - 1]
                  j += 1
              return index
      
          #just used to test whether nth_combination from the internet actually works
          def testnth(self):
              n = 0
              _setsize = self.setsize
              mainset = self.poolvals
              for someset in combinations(mainset, _setsize):
                  nthset = self.get_nth_combination(n)
                  n2 = self.get_n_from_combination(nthset)
                  if debugcombinations:
                      print(str(n) + ': ' + str(someset) + ' vs ' + str(n2) + ': ' + str(nthset))
                  if n != n2:
                      return False
                  for x in range(_setsize):
                      if someset[x] != nthset[x]:
                          return False
                  n += 1
              return True
      
      setcombination = IndexedCombination(5, list(range(1,10+1)))
      print( str(setcombination.get_n_from_combination([2,5,7,8,10])))
      

      returns 188

        Your Answer

        By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

        Not the answer you're looking for? Browse other questions tagged or ask your own question.

        易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
        该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!