Given a permutation's lexicographic number, is it possible to get any item in it in O(1)

问题

I want to know whether the task explained below is even theoretically possible, and if so how I could do it.

You are given a space of N elements (i.e. all numbers between 0 and N-1.) Let's look at the space of all permutations on that space, and call it S. The ith member of S, which can be marked S[i], is the permutation with the lexicographic number i.

For example, if N is 3, then S is this list of permutations:

S[0]: 0, 1, 2
S[1]: 0, 2, 1
S[2]: 1, 0, 2
S[3]: 1, 2, 0
S[4]: 2, 0, 1
S[5]: 2, 1, 0

(Of course, when looking at a big N, this space becomes very large, N! to be exact.)

Now, I already know how to get the permutation by its index number i, and I already know how to do the reverse (get the lexicographic number of a given permutation.) But I want something better.

Some permutations can be huge by themselves. For example, if you're looking at N=10^20. (The size of S would be (10^20)! which I believe is the biggest number I ever mentioned in a Stack Overflow question :)

If you're looking at just a random permutation on that space, it would be so big that you wouldn't be able to store the whole thing on your harddrive, let alone calculate each one of the items by lexicographic number. What I want is to be able to do item access on that permutation, and also get the index of each item. That is, given N and i to specify a permutation, have one function that takes an index number and find the number that resides in that index, and another function that takes a number and finds in which index it resides. I want to do that in O(1), so I don't need to store or iterate over each member in the permutation.

Crazy, you say? Impossible? That may be. But consider this: A block cipher, like AES, is essentially a permutation, and it almost accomplishes the tasks I outlined above. AES has a block size of 16 bytes, meaning that N is 256^16 which is around 10^38. (The size of S, not that it matters, is a staggering (256^16)!, or around 10^85070591730234615865843651857942052838, which beats my recent record for "biggest number mentioned on Stack Overflow" :)

Each AES encryption key specifies a single permutation on N=256^16. That permutation couldn't be stored whole on your computer, because it has more members than there are atoms in the solar system. But, it allows you item access. By encrypting data using AES, you're looking at the data block by block, and for each block (member of range(N)) you output the encrypted block, which the member of range(N) that is in the index number of the original block in the permutation. And when you're decrypting, you're doing the reverse (Finding the index number of a block.) I believe this is done in O(1), I'm not sure but in any case it's very fast.

The problem with using AES or any other block cipher is that it limits you to very specific N, and it probably only captures a tiny fraction of the possible permutations, while I want to be able to use any N I like, and do item access on any permutation S[i] that I like.

Is it possible to get O(1) item access on a permutation, given size N and permutation number i? If so, how?

(If I'm lucky enough to get code answers here, I'd appreciate if they'll be in Python.)

UPDATE:

Some people pointed out the sad fact that the permutation number itself would be so huge, that just reading the number would make the task non-feasible. Then, I'd like to revise my question: Given access to the factoradic representation of a permutation's lexicographic number, is it possible to get any item in the permutation in O(as small as possible)?

回答1:

The secret to doing this is to "count in base factorial".

In the same way that 134 = 1*10^2+3*10 + 4, 134 = 5! + 2 * 3! + 2! => 10210 in factorial notation (include 1!, exclude 0!). If you want to represent N!, you will then need N^2 base ten digits. (For each factorial digit N, the maximum number it can hold is N). Up to a bit of confusion about what you call 0, this factorial representation is exactly the lexicographic number of a permutation.

You can use this insight to solve Euler Problem 24 by hand. So I will do that here, and you will see how to solve your problem. We want the millionth permutation of 0-9. In factorial representation we take 1000000 => 26625122. Now to convert that to the permutation, I take my digits 0,1,2,3,4,5,6,7,8,9, and The first number is 2, which is the third (it could be 0), so I select 2 as the first digit, then I have a new list 0,1,3,4,5,6,7,8,9 and I take the seventh number which is 8 etc, and I get 2783915604.

However, this assumes that you start your lexicographic ordering at 0, if you actually start it at one, you have to subtract 1 from it, which gives 2783915460. Which is indeed the millionth permutation of the numbers 0-9.

You can obviously reverse this procedure, and hence convert backwards and forwards easily between the lexiographic number and the permutation that it represents.

I am not entirely clear what it is that you want to do here, but understanding the above procedure should help. For example, its clear that the lexiographic number represents an ordering which could be used as the key in a hashtable. And you can order numbers by comparing digits left to right so once you have inserted a number you never have to work outs it factorial.

回答2:

Your question is a bit moot, because your input size for an arbitrary permutation index has size log(N!) (assuming you want to represent all possible permutations) which is Theta(N log N), so if N is really large then just reading the input of the permutation index would take too long, certainly much longer than O(1). It may be possible to store the permutation index in such a way that if you already had it stored, then you could access elements in O(1) time. But probably any such method would be equivalent to just storing the permutation in contiguous memory (which also has Theta(N log N) size), and if you store the permutation directly in memory then the question becomes trivial assuming you can do O(1) memory access. (However you still need to account for the size of the bit encoding of the element, which is O(log N)).

In the spirit of your encryption analogy, perhaps you should specify a small SUBSET of permutations according to some property, and ask if O(1) or O(log N) element access is possible for that small subset.

回答3:

Edit:

I misunderstood the question, but it was not in waste. My algorithms let me understand: the factoradic representation of a permutation's lexicographic number is almost the same as the permutation itself. In fact the first digit of the factoradic representation is the same as the first element of the corresponding permutation (assuming your space consists of numbers from 0 to N-1). Knowing this there is not really a point in storing the index rather than the permutation itself . To see how to convert the lexicographic number into a permutation, read below. See also this wikipedia link about Lehmer code.

Original post:

In the S space there are N elements that can fill the first slot, meaning that there are (N-1)! elements that start with 0. So i/(N-1)! is the first element (lets call it 'a'). The subset of S that starts with 0 consists of (N-1)! elements. These are the possible permutations of the set N{a}. Now you can get the second element: its the i(%((N-1)!)/(N-2)!). Repeat the process and you got the permutation.

Reverse is just as simple. Start with i=0. Get the 2nd last element of the permutation. Make a set of the last two elements, and find the element's position in it (its either the 0th element or the 1st), lets call this position j. Then i+=j*2!. Repeat the process (you can start with the last element too, but it will always be the 0th element of the possibilities).

Java-ish pesudo code:

find_by_index(List N, int i){
    String str = "";
    for(int l = N.length-1; i >= 0; i--){
        int pos = i/fact(l);
        str += N.get(pos);
        N.remove(pos);
        i %= fact(l);
    }
    return str;
}

find_index(String str){
    OrderedList N;
    int i = 0;
    for(int l = str.length-1; l >= 0; l--){
        String item = str.charAt(l);
        int pos = N.add(item);
        i += pos*fact(str.length-l)
    }
    return i;
}

find_by_index should run in O(n) assuming that N is pre ordered, while find_index is O(n*log(n)) (where n is the size of the N space)

回答4:

After some research in Wikipedia, I desgined this algorithm:

def getPick(fact_num_list):
    """fact_num_list should be a list with the factorial number representation, 
    getPick will return a tuple"""
    result = [] #Desired pick
    #This will hold all the numbers pickable; not actually a set, but a list
    #instead
    inputset = range(len(fact_num_list)) 
    for fnl in fact_num_list:
        result.append(inputset[fnl])
        del inputset[fnl] #Make sure we can't pick the number again
    return tuple(result)

Obviously, this won't reach O(1) due the factor we need to "pick" every number. Due we do a for loop and thus, assuming all operations are O(1), getPick will run in O(n).

If we need to convert from base 10 to factorial base, this is an aux function:

import math

def base10_baseFactorial(number):
    """Converts a base10 number into a factorial base number. Output is a list
    for better handle of units over 36! (after using all 0-9 and A-Z)"""
    loop = 1
    #Make sure n! <= number
    while math.factorial(loop) <= number:
        loop += 1
    result = []
    if not math.factorial(loop) == number:
        loop -= 1 #Prevent dividing over a smaller number than denominator
    while loop > 0:
        denominator = math.factorial(loop)
        number, rem = divmod(number, denominator)
        result.append(rem)
        loop -= 1
    result.append(0) #Don't forget to divide to 0! as well!
    return result

Again, this will run in O(n) due to the whiles.

Summing all, the best time we can find is O(n).

PS: I'm not a native English speaker, so spelling and phrasing errors may appear. Apologies in advance, and let me know if you can't get around something.

回答5:

All correct algorithms for accessing the kth item of a permutation stored in factoradic form must read the first k digits. This is because, regardless of the values of the other digits among the first k, it makes a difference whether an unread digit is a 0 or takes on its maximum value. That this is the case can be seen by tracing the canonical correct decoding program in two parallel executions.

For example, if we want to decode the third digit of the permutation 1?0, then for 100, that digit is 0, and for 110, that digit is 2.

来源：https://stackoverflow.com/questions/23502180/given-a-permutations-lexicographic-number-is-it-possible-to-get-any-item-in-it

标签

algorithm

encryption

permutation

combinatorics

number-theory