Efficient way of generating latin squares (or randomly permute numbers in matrix uniquely on both axes - using NumPy)

前端 未结 5 1647
悲&欢浪女
悲&欢浪女 2021-01-05 00:31

For example, if there are 5 numbers 1, 2, 3, 4, 5

I want a random result like

[[ 2, 3, 1, 4, 5]
 [ 5, 1, 2, 3, 4]
 [ 3, 2, 4, 5, 1]
 [ 1, 4, 5, 2, 3]         


        
5条回答
  •  佛祖请我去吃肉
    2021-01-05 01:08

    Just for your information, what you are looking for is a way of generating latin squares. As for the solution, it depends on how much random "random" is for you.

    I would devise at least four main techniques, two of which have been already proposed. Hence, I will briefly describe the other two:

    1. loop through all possible permutations of the items and accept the first that satisfy the unicity constraint along rows
    2. use only cyclic permutations to build subsequent rows: these are by construction satisfying the unicity constraint along rows (the cyclic transformation can be done forward or backward); for improved "randomness" the rows can be shuffled

    Assuming we work with standard Python data types since I do not see a real merit in using NumPy (but results can be easily converted to np.ndarray if necessary), this would be in code (the first function is just to check that the solution is actually correct):

    import random
    import math
    import itertools
    
    # this only works for Iterable[Iterable]
    def is_latin_rectangle(rows):
        valid = True
        for row in rows:
            if len(set(row)) < len(row):
                valid = False
        if valid and rows:
            for i, val in enumerate(rows[0]):
                col = [row[i] for row in rows]
                if len(set(col)) < len(col):
                    valid = False
                    break
        return valid
    
    def is_latin_square(rows):
        return is_latin_rectangle(rows) and len(rows) == len(rows[0])
    
    # : prepare the input
    n = 9
    items = list(range(1, n + 1))
    # shuffle items
    random.shuffle(items)
    # number of permutations
    print(math.factorial(n))
    
    
    def latin_square1(items, shuffle=True):
        result = []
        for elems in itertools.permutations(items):
            valid = True
            for i, elem in enumerate(elems):
                orthogonals = [x[i] for x in result] + [elem]
                if len(set(orthogonals)) < len(orthogonals):
                    valid = False
                    break
            if valid:
                result.append(elems)
        if shuffle:
            random.shuffle(result)
        return result
    
    rows1 = latin_square1(items)
    for row in rows1:
        print(row)
    print(is_latin_square(rows1))
    
    
    def latin_square2(items, shuffle=True, forward=False):
        sign = -1 if forward else 1
        result = [items[sign * i:] + items[:sign * i] for i in range(len(items))]
        if shuffle:
            random.shuffle(result)
        return result
    
    rows2 = latin_square2(items)
    for row in rows2:
        print(row)
    print(is_latin_square(rows2))
    
    rows2b = latin_square2(items, False)
    for row in rows2b:
        print(row)
    print(is_latin_square(rows2b))
    

    For comparison, an implementation by trying random permutations and accepting valid ones (fundamentally what @hpaulj proposed) is also presented.

    def latin_square3(items):
        result = [list(items)]
        while len(result) < len(items):
            new_row = list(items)
            random.shuffle(new_row)
            result.append(new_row)
            if not is_latin_rectangle(result):
                result = result[:-1]
        return result
    
    rows3 = latin_square3(items)
    for row in rows3:
        print(row)
    print(is_latin_square(rows3))
    

    I did not have time (yet) to implement the other method (with backtrack Sudoku-like solutions from @ConfusedByCode).

    With timings for n = 5:

    %timeit latin_square1(items)
    321 µs ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit latin_square2(items)
    7.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square2(items, False)
    2.21 µs ± 69.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square3(items)
    2.15 ms ± 102 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    ... and for n = 9:

    %timeit latin_square1(items)
    895 ms ± 18.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    %timeit latin_square2(items)
    12.5 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square2(items, False)
    3.55 µs ± 55.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square3(items)
    The slowest run took 36.54 times longer than the fastest. This could mean that an intermediate result is being cached.
    9.76 s ± 9.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
    

    So, solution 1 is giving a fair deal of randomness but it is not terribly fast (and scale with O(n!)), solution 2 (and 2b) are much faster (scaling with O(n)) but not as random as solution 1. Solution 3 is very slow and the performance can vary significantly (can probably be sped up by letting the last iteration be computed instead of guessed).

    Getting more technical, other efficient algorithms are discussed in:

    • Jacobson, M. T. and Matthews, P. (1996), Generating uniformly distributed random latin squares. J. Combin. Designs, 4: 405-437. doi:10.1002/(SICI)1520-6610(1996)4:6<405::AID-JCD3>3.0.CO;2-J

提交回复
热议问题