问题
The classic Fisher Yates looks something like this:
void shuffle1(std::vector<int>& vec)
{
int n = vec.size();
for (int i = n - 1; i > 0; --i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Yesterday, I implemented the iteration "backwards" by mistake:
void shuffle2(std::vector<int>& vec)
{
int n = vec.size();
for (int i = 1; i < n; ++i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Is this version in any way worse (or better) than the first? Does it skew the resulting probabilities?
回答1:
Yes it's even distribution assuming rand()
is. We will prove this by showing that each input can generate each permutation with equal probability.
N=2 can be easily proven. We will draw it as a tree where the the children represent each string you can get by inserting the character after comma into the leftmost string.
0,1 //input where 0,1 represent indices
01 10 //output. Represents permutations of 01. It is clear that each one has equal probability
For N, we will have every permutations for N-1, and randomly swapping the last character for N
(N-1 0th permutation),N ..... (N-1 Ith permutation),N ________________________
/ \ / \ \
0th permutation of N 1st permutation.... (I*N)th permutation ((I*N)+1)th permutation .... (I*N)+(I-1)th permutation
This shitty induction should lead you to it having even distribution.
Example:
N=2:
0,1
01 10 // these are the permutations. Each one has equal probability
N=3:
0,1|2 // the | is used to separate characters that we will insert later
01,2 10,2 // 01, 10 are permutations from N-1, 2 is the new value
210 021 012 201 120 102 // these are the permutations, still equal probability
N=4: (curved to aid reading)
0,1|23
01,2|3 10,2|3
012,3 021,3 210,3 102,3 120,3 201,3
0123 0132 0321 3230 2013 2031 2310 3012
0213 0231 0312 3210 1203 1230 1302 3201
2103 2130 2301 3102 1023 1032 1320 3021
etc
回答2:
Looks OK to me (assuming rand() % N is unbiased, which it isn't). It seems it should be possible to demonstrate that every permutation of input is produced by exactly 1 sequence of random choices, where each random choice is balanced.
Compare this with a faulty implementation, such as
for (int i = 0; i < v.size(); ++i) {
swap(v[i], v[rand() % v.size()]);
}
Here you can see that there are nn equally likely ways to produce n! permutations, and since nn is not evenly divisible by n! where n > 2, some of those permutations have to be produced more often than others.
来源:https://stackoverflow.com/questions/8861568/fisher-yates-variation