Algorithm for equiprobable random square binary matrices with two non-adjacent non-zeros in each row and column

遇见更好的自我 2020-12-09 10:22

It would be great if someone could point me towards an algorithm that would allow me to:

  1. create a random square matrix, with entries 0 and 1, such that
  2. <
5 answers
  •  渐次进展
    2020-12-09 10:52

    (Updated test results, example run-through and code snippets below.)

    You can use dynamic programming to calculate the number of solutions resulting from every state (in a much more efficient way than a brute-force algorithm), and use those (pre-calculated) values to create equiprobable random solutions.

    Consider the example of a 7x7 matrix; at the start, the state is:

    0,0,0,0,0,0,0  
    

    meaning that there are seven adjacent unused columns. After adding two ones to the first row, the state could be e.g.:

    0,1,0,0,1,0,0  
    

    with two columns that now have a one in them. After adding ones to the second row, the state could be e.g.:

    0,1,1,0,1,0,1  
    

    After three rows are filled, there is a possibility that a column will have its maximum of two ones; this effectively splits the matrix into two independent zones:

    1,1,1,0,2,0,1  ->  1,1,1,0 + 0,1  
    

    These zones are independent in the sense that the no-adjacent-ones rule has no effect when adding ones to different zones, and the order of the zones has no effect on the number of solutions.
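
The zone split described above can be sketched as follows. This is only an illustration: the function name `splitIntoZones` is hypothetical, and the state is assumed to be an array holding the number of ones placed so far in each column.

```javascript
// Split a column-count state into independent zones.
// state[i] is the number of ones already placed in column i (0, 1 or 2).
// A full column (count 2) acts as a separator and belongs to no zone.
function splitIntoZones(state) {
    const zones = [];
    let zone = [];
    for (const count of state) {
        if (count === 2) {
            if (zone.length > 0) zones.push(zone);
            zone = [];
        } else {
            zone.push(count);
        }
    }
    if (zone.length > 0) zones.push(zone);
    return zones;
}

// The 7x7 state from the example: zones [1,1,1,0] and [0,1]
console.log(splitIntoZones([1, 1, 1, 0, 2, 0, 1]));
```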

    In order to use these states as signatures for types of solutions, we have to transform them into a canonical notation. First, we have to take into account the fact that columns with only 1 one in them may be unusable in the next row, because they contain a one in the current row. So instead of a binary notation, we have to use a ternary notation, e.g.:

    2,1,1,0 + 0,1  
    

    where the 2 means that this column was used in the current row (and not that there are 2 ones in the column). At the next step, we should then convert the twos back into ones.

    Additionally, we can also mirror the separate groups to put them into their lexicographically smallest notation:

    2,1,1,0 + 0,1  ->  0,1,1,2 + 0,1  
    

    Lastly, we sort the separate groups by size, from small to large, and then lexicographically, so that a state in a larger matrix may be e.g.:

    0,0 + 0,1 + 0,0,2 + 0,1,0 + 0,1,0,1  
    

    Then, when calculating the number of solutions resulting from each state, we can use memoization using the canonical notation of each state as a key.
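
As a minimal sketch of the canonical notation described above (ternary digits, mirrored zones, zones sorted by length and then lexicographically), a key could be built like this. The names `canonicalKey` and `current` are hypothetical, and the "+" separator follows the notation of the explanation; the code snippets further down in this answer use a different encoding.

```javascript
// Build a canonical key in the "+"-separated ternary notation used above.
// state[i]:   ones already in column i (0, 1 or 2), including the current row.
// current[i]: 1 if the row just added has a one in column i, else 0.
function canonicalKey(state, current) {
    const zones = [];
    let zone = [];
    const flush = () => {
        if (zone.length === 0) return;
        const ltr = zone.join("");
        const rtl = zone.slice().reverse().join("");
        zones.push(ltr < rtl ? ltr : rtl);          // keep the smaller mirror image
        zone = [];
    };
    for (let i = 0; i < state.length; i++) {
        if (state[i] === 2) flush();                // full column: zone boundary
        else zone.push(current[i] ? 2 : state[i]);  // 2 = used in the current row
    }
    flush();
    // shorter zones first, then lexicographic order among equal lengths
    zones.sort((a, b) => a.length - b.length || (a < b ? -1 : a > b ? 1 : 0));
    return zones.join("+");
}

// The 7x7 example, with ones added in columns 0 and 4 of the current row
console.log(canonicalKey([1, 1, 1, 0, 2, 0, 1], [1, 0, 0, 0, 1, 0, 0])); // 01+0112
```

Two states that produce the same key can share one memoized solution count, which is what keeps the dictionaries small.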

    Creating a dictionary of the states and the number of solutions for each of them only needs to be done once, and a table for larger matrices can probably be used for smaller matrices too.

    Practically, you'd generate a random integer from 0 up to (but not including) the total number of solutions, and then for every row, you'd look at the different states you could create from the current state, look at the number of unique solutions each one would generate, and see which option leads to the solution that corresponds with your randomly generated number.
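
The selection step for a single row can be sketched as follows. The helper name `chooseOption` is hypothetical; it only assumes that the solution count for each option has already been looked up in the dictionary.

```javascript
// Select the option that contains the pick-th solution (0-based), given the
// number of solutions reachable through each option.
// Returns the chosen option index and the rank of the pick within that option.
function chooseOption(counts, pick) {
    for (let i = 0; i < counts.length; i++) {
        if (pick < counts[i]) return { index: i, pick: pick };
        pick -= counts[i];
    }
    throw new RangeError("pick must be smaller than the sum of counts");
}

// Counts taken from the first step of the 5x5 example in this answer, pick = 13
console.log(chooseOption([4, 2, 2, 2, 2, 4], 13)); // { index: 5, pick: 1 }
```

Repeating this for every row, each time with the remaining rank, walks down to exactly one solution, so a uniformly random initial number gives a uniformly random matrix.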


    Note that every state and the corresponding key can only occur in a particular row, so you can store the keys in separate dictionaries per row.


    TEST RESULTS

    A first test using unoptimized JavaScript gave very promising results. With dynamic programming, calculating the number of solutions for a 10x10 matrix now takes a second, where a brute-force algorithm took several hours (and this is the part of the algorithm that only needs to be done once). The size of the dictionary with the signatures and numbers of solutions grows with a diminishing factor approaching 2.5 for each step in size; the time to generate it grows with a factor of around 3.
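
For comparison, a brute-force counter might look like the sketch below. It is not the code that was timed above, just an illustration under the same rules as the snippets in this answer: two ones per row and column, no two ones side by side in a row, and no one directly below a one in the previous row. The name `bruteForceCount` is hypothetical.

```javascript
// Brute-force count of n x n binary matrices with exactly two ones per row
// and column, where ones may not be horizontally or vertically adjacent.
// Exponential time: only practical as a cross-check for very small n.
function bruteForceCount(n) {
    const counts = new Array(n).fill(0);    // ones placed so far per column

    // prevA and prevB are the columns used in the previous row (-1 if none)
    function fill(row, prevA, prevB) {
        if (row === n) return counts.every(c => c === 2) ? 1 : 0;
        let total = 0;
        for (let i = 0; i < n - 2; i++) {
            if (counts[i] === 2 || i === prevA || i === prevB) continue;
            for (let j = i + 2; j < n; j++) {   // j >= i + 2: not adjacent to i
                if (counts[j] === 2 || j === prevA || j === prevB) continue;
                counts[i]++; counts[j]++;
                total += fill(row + 1, i, j);
                counts[i]--; counts[j]--;       // backtrack
            }
        }
        return total;
    }
    return fill(0, -1, -1);
}

console.log(bruteForceCount(4), bruteForceCount(5)); // 2 16
```

Because it visits every partial matrix instead of reusing counts for equivalent states, its running time grows far faster than that of the memoized version.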

    These are the numbers of solutions, states, signatures (total size of the dictionaries), and the maximum number of signatures per row (largest dictionary per row) that are created:

     size                               unique solutions       states signatures     max/row
    
     4x4                                               2            9          6           2
     5x5                                              16           73         26           8
     6x6                                             722          514        107          40
     7x7                                          33,988        2,870        411         152
     8x8                                       2,215,764       13,485      1,411         596
     9x9                                     179,431,924       56,375      4,510       1,983
    10x10                                 17,849,077,140      218,038     13,453       5,672
    11x11                              2,138,979,146,276      801,266     38,314      14,491
    12x12                            304,243,884,374,412    2,847,885    104,764      35,803
    13x13                         50,702,643,217,809,908    9,901,431    278,561      96,414
    14x14                      9,789,567,606,147,948,364   33,911,578    723,306     238,359
    15x15                  2,168,538,331,223,656,364,084  114,897,838  1,845,861     548,409
    16x16                546,386,962,452,256,865,969,596          ...  4,952,501   1,444,487
    17x17            155,420,047,516,794,379,573,558,433              12,837,870   3,754,040
    18x18         48,614,566,676,379,251,956,711,945,475              31,452,747   8,992,972
    19x19     17,139,174,923,928,277,182,879,888,254,495              74,818,773  20,929,008
    20x20  6,688,262,914,418,168,812,086,412,204,858,650             175,678,000  50,094,203
    

    (Additional results obtained with C++, using a simple 128-bit integer implementation. To count the states, the code had to be run using each state as a separate signature, which I was unable to do for the largest sizes.)
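
In JavaScript, ordinary Numbers are only exact up to 2^53 - 1 = 9,007,199,254,740,991, which the table shows is exceeded from 13x13 onwards. For those sizes, the counts and the random pick would have to be BigInt values. The sketch below draws a uniform random BigInt below a given total using rejection sampling. The name `randomBigIntBelow` is hypothetical, and using `Math.random()` as the source of randomness is an assumption made for brevity.

```javascript
// Draw a uniformly distributed BigInt in the range [0, total).
// Assumption: Math.random() is good enough as the source of randomness.
function randomBigIntBelow(total) {
    if (total <= 0n) throw new RangeError("total must be positive");
    const bits = (total - 1n).toString(2).length;   // bits needed for total - 1
    const chunks = Math.ceil(bits / 32);            // number of 32-bit draws
    const excess = BigInt(chunks * 32 - bits);      // surplus bits to drop
    for (;;) {
        let value = 0n;
        for (let i = 0; i < chunks; i++) {
            value = (value << 32n) | BigInt(Math.floor(Math.random() * 0x100000000));
        }
        value >>= excess;                   // now uniform in [0, 2^bits)
        if (value < total) return value;    // rejection sampling keeps it uniform
    }
}

// Example with the 13x13 total from the table
const total13 = 50702643217809908n;
console.log(randomBigIntBelow(total13) < total13); // true
```

Rejecting values that are too large, instead of taking a remainder, avoids favouring the lower part of the range.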


    EXAMPLE

    The dictionary for a 5x5 matrix looks like this:

    row 0:  00000  -> 16        row 3:  101    ->  0
                                        1112   ->  1
    row 1:  20002  ->  2                1121   ->  1
            00202  ->  4                1+01   ->  0
            02002  ->  2                11+12  ->  2
            02020  ->  2                1+121  ->  1
                                        0+1+1  ->  0
    row 2:  10212  ->  1                1+112  ->  1
            12012  ->  1
            12021  ->  2        row 4:  0      ->  0
            12102  ->  1                11     ->  0
            21012  ->  0                12     ->  0
            02121  ->  3                1+1    ->  1
            01212  ->  1                1+2    ->  0
    

    The total number of solutions is 16; if we randomly pick a number from 0 to 15, e.g. 13, we can find the corresponding (i.e. the 14th) solution like this:

    state:      00000  
    options:    10100  10010  10001  01010  01001  00101  
    signature:  00202  02002  20002  02020  02002  00202  
    solutions:    4      2      2      2      2      4  
    

    This tells us that the 14th solution is the 2nd solution of option 00101. The next step is:

    state:      00101  
    options:    10010  01010  
    signature:  12102  02121  
    solutions:    1      3  
    

    This tells us that the 2nd solution is the 1st solution of option 01010. The next step is:

    state:      01111  
    options:    10100  10001  00101  
    signature:  11+12  1112   1+01  
    solutions:    2      1      0  
    

    This tells us that the 1st solution is the 1st solution of option 10100. The next step is:

    state:      11211  
    options:    01010  01001  
    signature:  1+1    1+1  
    solutions:    1      1  
    

    This tells us that the 1st solution is the 1st solution of option 01010. The last step is:

    state:      12221  
    options:    10001  
    

    And the 5x5 matrix corresponding to randomly chosen number 13 is:

    0 0 1 0 1  
    0 1 0 1 0  
    1 0 1 0 0
    0 1 0 1 0  
    1 0 0 0 1  
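
A result like this can be checked mechanically. The sketch below tests the rules as the code in this answer interprets them: two ones per row and per column, with no two ones horizontally or vertically adjacent. The function name `isValidMatrix` is hypothetical.

```javascript
// Check a square 0/1 matrix: exactly two ones per row and per column,
// and no two ones that are horizontally or vertically adjacent.
function isValidMatrix(matrix) {
    const n = matrix.length;
    const columnSums = new Array(n).fill(0);
    for (let r = 0; r < n; r++) {
        if (matrix[r].length !== n) return false;
        let rowSum = 0;
        for (let c = 0; c < n; c++) {
            const v = matrix[r][c];
            if (v !== 0 && v !== 1) return false;
            if (v === 0) continue;
            rowSum++;
            columnSums[c]++;
            if (c > 0 && matrix[r][c - 1] === 1) return false;  // left neighbour
            if (r > 0 && matrix[r - 1][c] === 1) return false;  // upper neighbour
        }
        if (rowSum !== 2) return false;
    }
    return columnSums.every(sum => sum === 2);
}

const example = [
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
];
console.log(isValidMatrix(example)); // true
```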
    

    And here's a quick'n'dirty code example; run the snippet to generate the signature and solution count dictionary, and generate a random 10x10 matrix (it takes a second to generate the dictionary; once that is done, it generates random solutions in half a millisecond):

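    // Canonical signature of a state: zones are delimited by full columns,
    // mirrored to their smallest notation, sorted, and joined with "2".
    function signature(state, prev) {
        var zones = [], zone = [];
        for (var i = 0; i < state.length; i++) {
            if (state[i] == 2) {                    // full column: close the current zone
                if (zone.length) zones.push(mirror(zone));
                zone = [];
            }
            else if (prev[i]) zone.push(3);         // column has a one in the latest row
            else zone.push(state[i]);               // otherwise: number of ones (0 or 1)
        }
        if (zone.length) zones.push(mirror(zone));
        // sort the zones by length, then by numeric value
        zones.sort(function(a,b) {return a.length - b.length || a - b;});
        return zones.length ? zones.join("2") : "2";
    
        // return the lexicographically smaller of the zone and its mirror image
        function mirror(zone) {
            var ltr = zone.join('');
            zone.reverse();
            var rtl = zone.join('');
            return (ltr < rtl) ? ltr : rtl;
        }
    }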
    
    // Build one dictionary per row that maps signatures to solution counts.
    function memoize(n) {
        var memo = [], empty = [];
        for (var i = 0; i <= n; i++) memo[i] = [];
        for (var i = 0; i < n; i++) empty[i] = 0;
        memo[0][signature(empty, empty)] = next_row(empty, empty, 1);
        return memo;
    
        // count the solutions that can be completed from this state
        function next_row(state, prev, row) {
            if (row > n) return 1;                  // all rows filled: one solution
            var solutions = 0;
            for (var i = 0; i < n - 2; i++) {
                if (state[i] == 2 || prev[i] == 1) continue;    // column full or just used
                for (var j = i + 2; j < n; j++) {   // j >= i + 2 keeps the ones non-adjacent
                    if (state[j] == 2 || prev[j] == 1) continue;
                    var s = state.slice(), p = empty.slice();
                    ++s[i]; ++s[j]; ++p[i]; ++p[j];
                    var sig = signature(s, p);
                    var sol = memo[row][sig];
                    if (sol === undefined)          // not in the dictionary yet
                        memo[row][sig] = sol = next_row(s, p, row + 1);
                    solutions += sol;
                }
            }
            return solutions;
        }
    }
    
    function random_matrix(n, memo) {
        var matrix = [], empty = [], state = [], prev = [];
        for (var i = 0; i < n; i++) empty[i] = state[i] = prev[i] = 0;
        var total = memo[0][signature(empty, empty)];
        var pick = Math.floor(Math.random() * total);
        document.write("solution " + pick.toLocaleString('en-US') +
            " from a total of " + total.toLocaleString('en-US') + "<br>");
        for (var row = 1; row <= n; row++) {
            var options = find_options(state, prev);
            for (var i in options) {
                var state_copy = state.slice();
                for (var j in state_copy) state_copy[j] += options[i][j];
                var sig = signature(state_copy, options[i]);
                var solutions = memo[row][sig];
                if (pick < solutions) {
                    matrix.push(options[i].slice());
                    prev = options[i].slice();
                    state = state_copy.slice();
                    break;
                }
                else pick -= solutions;
            }
        }
        return matrix;
    
        function find_options(state, prev) {
            var options = [];
            for (var i = 0; i < n - 2; i++) {
                if (state[i] == 2 || prev[i] == 1) continue;
                for (var j = i + 2; j < n; j++) {
                    if (state[j] == 2 || prev[j] == 1) continue;
                    var option = empty.slice();
                    ++option[i]; ++option[j];
                    options.push(option);
                }
            }
            return options;
        }
    }
    
    var size = 10;
    var memo = memoize(size);
    var matrix = random_matrix(size, memo);
    for (var row in matrix) document.write(matrix[row] + "<br>");

    The code snippet below shows the dictionary of signatures and solution counts for a matrix of size 10x10. I've used a slightly different signature format from the explanation above: the zones are delimited by a '2' instead of a plus sign, and a column which has a one in the previous row is marked with a '3' instead of a '2'. This shows how the keys could be stored in a file as integers with 2×N bits (padded with 2's).

    function signature(state, prev) {
        var zones = [], zone = [];
        for (var i = 0; i < state.length; i++) {
            if (state[i] == 2) {
                if (zone.length) zones.push(mirror(zone));
                zone = [];
            }
            else if (prev[i]) zone.push(3);
            else zone.push(state[i]);
        }
        if (zone.length) zones.push(mirror(zone));
        zones.sort(function(a,b) {return a.length - b.length || a - b;});
        return zones.length ? zones.join("2") : "2";
    
        function mirror(zone) {
            var ltr = zone.join('');
            zone.reverse();
            var rtl = zone.join('');
            return (ltr < rtl) ? ltr : rtl;
        }
    }
    
    function memoize(n) {
        var memo = [], empty = [];
        for (var i = 0; i <= n; i++) memo[i] = [];
        for (var i = 0; i < n; i++) empty[i] = 0;
        memo[0][signature(empty, empty)] = next_row(empty, empty, 1);
        return memo;
    
        function next_row(state, prev, row) {
            if (row > n) return 1;
            var solutions = 0;
            for (var i = 0; i < n - 2; i++) {
                if (state[i] == 2 || prev[i] == 1) continue;
                for (var j = i + 2; j < n; j++) {
                    if (state[j] == 2 || prev[j] == 1) continue;
                    var s = state.slice(), p = empty.slice();
                    ++s[i]; ++s[j]; ++p[i]; ++p[j];
                    var sig = signature(s, p);
                    var sol = memo[row][sig];
                    if (sol == undefined) 
                        memo[row][sig] = sol = next_row(s, p, row + 1);
                    solutions += sol;
                }
            }
            return solutions;
        }
    }
    
    var memo = memoize(10);
    for (var i in memo) {
        document.write("row " + i + ":<br>");
        for (var j in memo[i]) {
            document.write('"' + j + '": ' + memo[i][j] + "<br>");
        }
    }
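
The remark about storing the keys as integers with 2×N bits can be sketched as follows. The function names are hypothetical, and padding on the right is an assumption, since the answer does not say on which side the 2's are added. It relies on the fact that a signature in this format never ends in '2', except for the special signature "2" of a completely full state.

```javascript
// Pack a signature string made of the digits 0-3 into an integer that uses
// two bits per digit, padding on the right with '2' up to n digits.
function packSignature(signature, n) {
    if (!/^[0-3]+$/.test(signature) || signature.length > n) {
        throw new RangeError("signature must consist of 1 to n digits 0-3");
    }
    let packed = 0n;
    for (const digit of signature.padEnd(n, "2")) {
        packed = (packed << 2n) | BigInt(digit);
    }
    return packed;
}

// Inverse: recover the n digits and strip the padding again.
function unpackSignature(packed, n) {
    let digits = "";
    for (let i = 0; i < n; i++) {
        digits = (packed & 3n).toString() + digits;
        packed >>= 2n;
    }
    const stripped = digits.replace(/2+$/, "");     // drop trailing padding
    return stripped === "" ? "2" : stripped;        // all padding: the "2" signature
}

console.log(packSignature("3003", 4)); // 195n
```

BigInt is used here so that the packing does not depend on the 53-bit limit of Numbers for larger N.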
