Given a combination of k of the first n natural numbers, for some reason I need to find the position of that combination among those returned by
I dug up some old code (since converted to Python 3 syntax) that includes the function combination_index, which does what you request:
def fact(n, _f=[1, 1, 2, 6, 24, 120, 720]):
    """Return n!
    The “hidden” list _f acts as a cache"""
    try:
        return _f[n]
    except IndexError:
        while len(_f) <= n:
            _f.append(_f[-1] * len(_f))
        return _f[n]

def indexed_combination(n: int, k: int, index: int) -> tuple:
    """Select the 'index'th combination of k over n
    Result is a tuple (i | i∈{0…n-1}) of length k
    Note that if index ≥ binomial_coefficient(n,k)
    then the result is almost always invalid"""
    result= []
    for item, n in enumerate(range(n, -1, -1)):
        pivot= fact(n-1)//fact(k-1)//fact(n-k)
        if index < pivot:
            result.append(item)
            k-= 1
            if k <= 0: break
        else:
            index-= pivot
    return tuple(result)

def combination_index(combination: tuple, n: int) -> int:
    """Return the index of combination (length == k)
    The combination argument should be a sorted sequence (i | i∈{0…n-1})"""
    k= len(combination)
    index= 0
    item_in_check= 0
    n-= 1 # to simplify subsequent calculations
    for offset, item in enumerate(combination, 1):
        while item_in_check < item:
            index+= fact(n-item_in_check)//fact(k-offset)//fact(n+offset-item_in_check-k)
            item_in_check+= 1
        item_in_check+= 1
    return index

def test():
    for n in range(1, 11):
        for k in range(1, n+1):
            max_index= fact(n)//fact(k)//fact(n-k)
            for i in range(max_index):
                comb= indexed_combination(n, k, i)
                i2= combination_index(comb, n)
                if i2 != i:
                    raise RuntimeError("mismatching n:%d k:%d i:%d≠%d" % (n, k, i, i2))
indexed_combination does the inverse operation.
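For readers who want to see the round trip in isolation, here is a minimal, self-contained sketch of the same ranking idea. It is not the code above: it swaps the fact calls for math.comb (Python 3.8+), the names rank_combination and unrank_combination are made up for illustration, and it assumes the order in question is the lexicographic one that itertools.combinations(range(n), k) yields.

```python
from itertools import combinations
from math import comb  # available since Python 3.8


def rank_combination(combination, n):
    """Lexicographic index of a sorted k-combination of range(n) (hypothetical helper)."""
    k = len(combination)
    index = 0
    previous = -1
    for position, item in enumerate(combination):
        # Skip every combination that has a smaller value at this position.
        for smaller in range(previous + 1, item):
            index += comb(n - 1 - smaller, k - 1 - position)
        previous = item
    return index


def unrank_combination(n, k, index):
    """Inverse of rank_combination; assumes 0 <= index < comb(n, k)."""
    result = []
    candidate = 0
    while k > 0:
        # Number of remaining combinations that take `candidate` as the next item.
        block = comb(n - 1 - candidate, k - 1)
        if index < block:
            result.append(candidate)
            k -= 1
        else:
            index -= block
        candidate += 1
    return tuple(result)


if __name__ == "__main__":
    n, k = 5, 3
    # Cross-check both directions against the order itertools produces.
    for expected, combo in enumerate(combinations(range(n), k)):
        assert rank_combination(combo, n) == expected
        assert unrank_combination(n, k, expected) == combo
    print(rank_combination((1, 3, 4), 5))  # prints 8
```

The cross-check loop plays the same role as test() above, except that it compares against itertools.combinations directly instead of only checking that the two functions invert each other.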
PS: I remember that I once tried removing all those fact calls (by substituting appropriate incremental multiplications and divisions), but the code became much more complicated and wasn't actually faster. Substituting a pre-calculated list of factorials for the fact function did give a speedup, but the difference was negligible for my use cases, so I kept this version.
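As a rough illustration of the pre-calculated list idea, a sketch could look like the following. The names make_binomial and binomial are hypothetical and this is not the variant that was originally timed; it only shows the shape of the approach, assuming the largest n is known in advance.

```python
def make_binomial(max_n):
    """Return a binomial(n, k) function backed by a factorial list built once."""
    factorials = [1]
    for i in range(1, max_n + 1):
        factorials.append(factorials[-1] * i)

    def binomial(n, k):
        # Out-of-range k means there are no such combinations.
        if k < 0 or k > n:
            return 0
        # n above max_n raises IndexError, since the list is fixed in size.
        return factorials[n] // (factorials[k] * factorials[n - k])

    return binomial


if __name__ == "__main__":
    binomial = make_binomial(10)
    print(binomial(10, 3))  # prints 120
```

Compared with fact, this trades the growing cache and the try/except for a fixed upper bound on n.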