Question
I have 33 students that I want to sort into groups of 6 (or as close as possible) on 5 different occasions, so I assign each student a number between 1 and 6 for each occasion.
I've managed the following:
studentlist <- data.frame(seq(1:33))
studentlist$Occassion1 <- sample(factor(rep(1:6, length.out=nrow(studentlist)),
                                        labels=paste0(1:6)))
studentlist$Occassion2 <- sample(factor(rep(1:6, length.out=nrow(studentlist)),
                                        labels=paste0(1:6)))
studentlist$Occassion3 <- sample(factor(rep(1:6, length.out=nrow(studentlist)),
                                        labels=paste0(1:6)))
studentlist$Occassion4 <- sample(factor(rep(1:6, length.out=nrow(studentlist)),
                                        labels=paste0(1:6)))
studentlist$Occassion5 <- sample(factor(rep(1:6, length.out=nrow(studentlist)),
                                        labels=paste0(1:6)))
This seems to work. As I understand it, each row is given a random value between 1 and 6.
How does the length.out argument from rep() interact with sample()?
When I run a frequency table to check the sizes of the groups, I find the following:
numb=1,2,3,4,5,6. size=6,6,6,5,5,5.
I tried asking for 7 groups instead, and got group sizes of:
numb=1,2,3,4,5,6,7. size=5,5,5,5,5,4,4.
Why are they organised in this decreasing fashion?
Answer 1:
You get this specific pattern because of how rep() works with length.out. To create 6 groups,
rep(1:6, length.out = 33)
first repeats the numbers 1 to 6 five times (generating 30 values) and then fills the 3 remaining positions with the values 1, 2 and 3. sample() only shuffles the order of these values and does not change how often each one occurs, so groups 1, 2 and 3 will always have one more student than the others.
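The recycling described above can be checked directly. The following is a minimal sketch, not part of the original answer: the variable names (n_students, n_groups, group_ids and so on) and the seed are chosen here purely for illustration. It tabulates the group sizes before and after shuffling, and repeats the count for 7 groups.

```r
# Numbers taken from the question; the variable names are illustrative
n_students <- 33
n_groups <- 6

# rep() recycles 1:6 until the vector has exactly 33 elements
group_ids <- rep(seq_len(n_groups), length.out = n_students)

# 33 = 5 * 6 + 3, so groups 1 to 3 each receive one extra member
sizes_before <- table(group_ids)
print(sizes_before)

# sample() called on a vector returns a random permutation of it,
# so only the order changes, not the count of each value
set.seed(1)  # arbitrary seed, only to make the shuffle reproducible
shuffled <- sample(group_ids)
sizes_after <- table(shuffled)
print(sizes_after)

# Same arithmetic for 7 groups: 33 = 4 * 7 + 5
sizes_seven <- table(rep(1:7, length.out = n_students))
print(sizes_seven)
```

The two tables for 6 groups should match, which is why the frequency table in the question always shows the larger groups first: the group labels with extra members are fixed by rep(), and sample() only decides which students receive them.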
Source: https://stackoverflow.com/questions/58660903/r-assigning-students-to-equal-groups-with-random-sampling-understanding-rep