Question
If I have a vector of letters:
> all <- letters
> all
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
and then I define a reference sample from letters as follows:
> refSample <- c("j","l","m","s")
in which the spacing between elements is 2 (1st to 2nd), 1 (2nd to 3rd) and 6 (3rd to 4th), how can I then select n samples from all whose elements have the same spacing as refSample, without wrapping around? For example, "a","c","d","j" and "q","s","t","z" would be valid samples, but "a","c","d","k" and "r","t","u","a" would not. The first of these invalid samples has an index difference of 7 (rather than 6) between the 3rd and last element, whereas the second has the correct spacing but wraps around.
Second, how can I parameterise this, so that whatever refSample is used, I can use its spacing as a template?
Answer 1:
Here's a simple way:
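all <- letters
refSample <- c("j","l","m","s")
pick_matches <- function(n, ref, full) {
  iref <- match(ref, full)               # positions of the reference elements in full
  spaces <- diff(iref)                   # gaps between consecutive positions
  tot_space <- sum(spaces)               # distance from the first to the last element
  max_start <- length(full) - tot_space  # largest start index that stays within full
  # seq_len() avoids the 1:0 pitfall when no valid start exists
  starts <- sample(seq_len(max_start), n, replace = TRUE)
  # for each start s, the indices are s, s + spaces[1], s + spaces[1] + spaces[2], ...
  return( sapply( starts, function(s) full[ cumsum(c(s, spaces)) ] ) )
}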
> set.seed(1)
> pick_matches(5, refSample, all) # each COLUMN is a desired sample vector
     [,1] [,2] [,3] [,4] [,5]
[1,] "e"  "g"  "j"  "p"  "d"
[2,] "g"  "i"  "l"  "r"  "f"
[3,] "h"  "j"  "m"  "s"  "g"
[4,] "n"  "p"  "s"  "y"  "m"
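To check the result against the examples in the question, a small validation helper can be sketched as follows. The name same_spacing is hypothetical and not part of the original answer, and the sketch assumes the reference sample is given in increasing order of position.

```r
# Hypothetical helper (not from the original answer): TRUE when `cand`
# has the same index spacing within `full` as `ref` does.
same_spacing <- function(cand, ref, full) {
  icand <- match(cand, full)
  iref  <- match(ref, full)
  # A candidate that wraps around produces a negative step in diff(),
  # so it cannot equal the spacing of an increasing reference sample.
  !anyNA(icand) && identical(diff(icand), diff(iref))
}

refSample <- c("j", "l", "m", "s")
same_spacing(c("a", "c", "d", "j"), refSample, letters)  # TRUE
same_spacing(c("q", "s", "t", "z"), refSample, letters)  # TRUE
same_spacing(c("a", "c", "d", "k"), refSample, letters)  # FALSE: last gap is 7
same_spacing(c("r", "t", "u", "a"), refSample, letters)  # FALSE: wraps around
```

Applying it to every column returned by pick_matches, for example with apply(m, 2, same_spacing, ref = refSample, full = all), should return TRUE throughout, since start indices are only drawn from 1 to max_start.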
Source: https://stackoverflow.com/questions/10438705/using-a-sample-list-as-a-template-for-sampling-from-a-larger-list-without-wrapar