Is it possible to shuffle a 2D matrix while preserving row AND column frequencies?

孤城傲影 2020-12-19 12:40

Suppose I have a 2D array like the following:

GACTG
AGATA
TCCGA

Each array element is taken from a small finite set (in my case, DNA nucleo

4 Answers
  •  别那么骄傲
    2020-12-19 13:27

    Edit: oops missed the last paragraph of OP's question, let me rephrase.

    To digress briefly, the question you linked to had quite a hilarious discussion about the "level" of randomness of the selected solution. Allow me to paraphrase:

    "...I really require matrices that are as random as possible..."

    "...The algorithm, as implemented in the code, is quite random..."

    "...if you choose this method, a different way to improve the randomness is to repeat the randomization process several times (a random number of times)..."

    None of these comments makes any sort of sense. There is no such thing as "more" random; this is all exactly like this lovely Daily WTF entry. That said, the last quote is almost onto something. It's well known that if you simulate a Markov chain, like that random swapping algorithm, for long enough, you will eventually start generating samples from its steady-state (stationary) distribution, at least when the chain is irreducible and aperiodic. Just exactly what that distribution looks like, who knows...

    Anyway, depending on your objectives, you may not really care what this distribution looks like as long as it contains enough elements. So some sort of swapping algorithm might be useful, but I really would not expect this to be easy, since the problem is NP-complete (it is more general than Sudoku).
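
    As a rough illustration of the swapping idea, here is a minimal Python sketch. It is not the algorithm from the linked question; the move it uses is an assumption chosen for illustration: pick two rows and two columns, and if the four cells form the pattern `a b / b a`, flip it to `b a / a b`. That leaves the letter counts of both rows and both columns unchanged. The names `swap_shuffle` and `frequencies` are hypothetical helpers.

```python
import random
from collections import Counter

def swap_shuffle(matrix, steps=10_000, seed=None):
    """Randomize `matrix` with 2x2 swaps that keep every row and column multiset.

    Assumes at least two rows and two columns, all rows of equal length.
    """
    rng = random.Random(seed)
    grid = [list(row) for row in matrix]
    n_rows, n_cols = len(grid), len(grid[0])
    for _ in range(steps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        a, b = grid[r1][c1], grid[r1][c2]
        # The move applies only to the pattern  a b / b a  with a != b;
        # any other proposal is rejected and the matrix stays as it is.
        if a != b and grid[r2][c1] == b and grid[r2][c2] == a:
            grid[r1][c1], grid[r1][c2] = b, a
            grid[r2][c1], grid[r2][c2] = a, b
    return ["".join(row) for row in grid]

def frequencies(matrix):
    """Letter counts for each row and each column."""
    rows = [Counter(row) for row in matrix]
    cols = [Counter(col) for col in zip(*matrix)]
    return rows, cols

original = ["GACTG", "AGATA", "TCCGA"]
shuffled = swap_shuffle(original, steps=10_000, seed=0)
print("\n".join(shuffled))
print(frequencies(shuffled) == frequencies(original))
```

    Since each move and its reverse are proposed with equal probability, the chain is symmetric, which suggests a uniform stationary distribution over whatever set of matrices the moves can reach. This sketch does not establish that this set contains every valid matrix, nor how many steps are enough.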

    With that in mind, you could consider solving your problem with any approach that works for solving Sudoku. If you are in academia, I would suggest getting a copy of IBM CPLEX 12, which is free for academic use. You can code up a Sudoku-like solver in their CP language (OPL) and ask the integer linear program solver to generate solutions for you. I think they even have example code for solving Sudoku you can borrow from.

    Here's the only truly random and unbiased way I can think of to sample from such matrices. First, get CPLEX to find all N solutions to the given Sudoku-like problem. Once you have this set of N solutions, draw a random number between 1 and N and use that solution; if you want another one, draw another number. Since generating all solutions might be a bit slow, you could approximate this by telling the solver to stop after a certain number of solutions or a certain amount of elapsed time, and sample only from that set.
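
    For a matrix as small as the 3x5 example in the question, the enumerate-then-sample idea can be tried without a solver. The sketch below swaps in plain brute force with Python's `itertools` in place of CPLEX, so it only scales to tiny inputs. `enumerate_matrices` is a hypothetical helper name, and "same frequencies" is assumed to mean that each row and each column keeps its own multiset of letters.

```python
import itertools
import random

def enumerate_matrices(matrix):
    """Return every matrix with the same per-row and per-column letter multisets."""
    target_cols = [sorted(col) for col in zip(*matrix)]
    # Distinct orderings of each row; the set collapses duplicate permutations.
    row_options = [sorted(set(itertools.permutations(row))) for row in matrix]
    solutions = []
    for rows in itertools.product(*row_options):
        # Row multisets hold by construction; only the columns need checking.
        if [sorted(col) for col in zip(*rows)] == target_cols:
            solutions.append(["".join(r) for r in rows])
    return solutions

original = ["GACTG", "AGATA", "TCCGA"]
solutions = enumerate_matrices(original)   # the N solutions
rng = random.Random(0)
sample = rng.choice(solutions)             # uniform draw from the N solutions
print(len(solutions))
print("\n".join(sample))
```

    Every call to `rng.choice(solutions)` is an independent uniform draw, so drawing another matrix costs nothing once the list exists. The expensive part is the enumeration itself, which grows very quickly with matrix size.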
