Question
I am looking for a function that returns all the unordered combinations of a vector, e.g.
x<-c('red','blue','black')
uncomb(x)
[1]'red'
[2]'blue'
[3]'black'
[4]'red','blue'
[5]'blue','black'
[6]'red','black'
[7]'red','blue','black'
I guess there is a function in some library that does this, but I can't find it. I tried permutations from gtools, but it is not the function I am looking for.
Answer 1:
You could apply combn() over a sequence the length of x, so that each value in the sequence is passed as the m argument.
x <- c("red", "blue", "black")
do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))
# [[1]]
# [1] "red"
#
# [[2]]
# [1] "blue"
#
# [[3]]
# [1] "black"
#
# [[4]]
# [1] "red" "blue"
#
# [[5]]
# [1] "red" "black"
#
# [[6]]
# [1] "blue" "black"
#
# [[7]]
# [1] "red" "blue" "black"
If you prefer a matrix result, then you can apply stringi::stri_list2matrix() to the list above.
stringi::stri_list2matrix(
  do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE)),
  byrow = TRUE
)
#      [,1]    [,2]    [,3]
# [1,] "red"   NA      NA
# [2,] "blue"  NA      NA
# [3,] "black" NA      NA
# [4,] "red"   "blue"  NA
# [5,] "red"   "black" NA
# [6,] "blue"  "black" NA
# [7,] "red"   "blue"  "black"
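For convenience, the one-liner above can be wrapped in a function. This is only a minimal sketch: the name uncomb is borrowed from the call in the question and is not an existing function in any package. One caveat is that combn() treats a single positive number n as seq_len(n), so a length-one numeric x would need special handling.

```r
# Hypothetical helper, named after the uncomb() call in the question.
uncomb <- function(x) {
  # For each size m = 1..length(x), list every m-element combination,
  # then flatten the per-size lists into one list.
  do.call(c, lapply(seq_along(x), function(m) combn(x, m, simplify = FALSE)))
}

x <- c("red", "blue", "black")
res <- uncomb(x)
length(res)  # 2^3 - 1 = 7 non-empty combinations
```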
Answer 2:
I was redirected here from "List All Combinations With combn", as this was one of the dupe targets. This is an old question and the answer provided by @RichScriven is very nice, but I wanted to give the community a few more options that are arguably more natural and, in the case of the last two, more efficient.
We first note that the output is very similar to the Power Set. Calling powerSet from the rje package, we see that indeed our output matches every element from the power set except the first element which is equivalent to the Empty Set:
x <- c("red", "blue", "black")
rje::powerSet(x)
[[1]]
character(0) ## empty set equivalent
[[2]]
[1] "red"
[[3]]
[1] "blue"
[[4]]
[1] "red" "blue"
[[5]]
[1] "black"
[[6]]
[1] "red" "black"
[[7]]
[1] "blue" "black"
[[8]]
[1] "red" "blue" "black"
If you don't want the first element, you can simply add [-1] to the end of the function call, like so: rje::powerSet(x)[-1].
The next two solutions are from the newer packages arrangements and RcppAlgos (I am the author), which offer the user large gains in efficiency. Both of these packages are capable of generating combinations of multisets.
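The ordering of the power set shown above matches counting in binary, where bit j of the index decides whether the j-th element is included. As an illustration only, here is a base-R sketch of that idea; power_set_bits is a hypothetical helper, not the rje implementation, and it is only practical for short vectors.

```r
# Hypothetical base-R helper; not how rje::powerSet is implemented.
power_set_bits <- function(x) {
  n <- length(x)
  weights <- 2^(seq_len(n) - 1)  # place values 1, 2, 4, ...
  lapply(0:(2^n - 1), function(i) {
    # Element j is kept when bit j of the index i is set.
    keep <- (i %/% weights) %% 2 == 1
    x[keep]
  })
}

x <- c("red", "blue", "black")
ps <- power_set_bits(x)
length(ps)          # 2^3 = 8, including the empty set
nonempty <- ps[-1]  # drop the empty set, as with powerSet(x)[-1]
```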
Why is this important?
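It can be shown that there is a one-to-one mapping from the power set of A to all combinations of the multiset c(rep(emptyElement, length(A)), A) choose length(A), where emptyElement is a representation of the empty set (like zero or a blank). The calls below use one fewer copy of the blank (a frequency of 2 rather than 3), which leaves out the row of all blanks, i.e. the empty set itself. With this in mind, observe: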
library(arrangements)
combinations(x = c("",x), k = 3, freq = c(2, rep(1, 3)))
     [,1]  [,2]   [,3]
[1,] ""    ""     "red"
[2,] ""    ""     "blue"
[3,] ""    ""     "black"
[4,] ""    "red"  "blue"
[5,] ""    "red"  "black"
[6,] ""    "blue" "black"
[7,] "red" "blue" "black"
library(RcppAlgos)
comboGeneral(c("",x), 3, freqs = c(2, rep(1, 3)))
     [,1]    [,2]    [,3]
[1,] ""      ""      "black"
[2,] ""      ""      "blue"
[3,] ""      ""      "red"
[4,] ""      "black" "blue"
[5,] ""      "black" "red"
[6,] ""      "blue"  "red"
[7,] "black" "blue"  "red"
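To make the mapping concrete, each padded row can be turned back into the subset it encodes by dropping the blanks. The sketch below uses base R only and copies the matrix from the arrangements output above rather than calling either package; it assumes "" is not itself an element of x.

```r
# Matrix copied from the arrangements output shown above.
padded <- matrix(
  c("",    "",     "red",
    "",    "",     "blue",
    "",    "",     "black",
    "",    "red",  "blue",
    "",    "red",  "black",
    "",    "blue", "black",
    "red", "blue", "black"),
  ncol = 3, byrow = TRUE
)

# Drop the blank placeholders from each row to recover the subset.
subsets <- lapply(seq_len(nrow(padded)), function(i) {
  r <- padded[i, ]
  r[r != ""]
})
lengths(subsets)  # subset sizes: 1 1 1 2 2 2 3
```

Every row yields a distinct non-empty subset, which is the one-to-one correspondence described above, minus the empty set.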
If you don't like dealing with blank elements and/or matrices, you can also return a list making use of lapply.
lapply(seq_along(x), comboGeneral, v = x)
[[1]]
     [,1]
[1,] "black"
[2,] "blue"
[3,] "red"
[[2]]
     [,1]    [,2]
[1,] "black" "blue"
[2,] "black" "red"
[3,] "blue"  "red"
[[3]]
     [,1]    [,2]   [,3]
[1,] "black" "blue" "red"
lapply(seq_along(x), combinations, n = length(x), x = x)
[[1]]
     [,1]
[1,] "red"
[2,] "blue"
[3,] "black"
[[2]]
     [,1]   [,2]
[1,] "red"  "blue"
[2,] "red"  "black"
[3,] "blue" "black"
[[3]]
     [,1]  [,2]   [,3]
[1,] "red" "blue" "black"
Now we show that the last two methods are much more efficient (N.B. I removed do.call(c, and simplify = FALSE from the answer provided by @RichScriven in order to compare generation of similar outputs. I also included rje::powerSet for good measure):
set.seed(8128)
bigX <- sort(sample(10^6, 20)) ## With this as an input, we will get 2^20 - 1 results.. i.e. 1,048,575
library(microbenchmark)
microbenchmark(powSetRje = rje::powerSet(bigX),
               powSetRich = lapply(seq_along(bigX), combn, x = bigX),
               powSetArrange = lapply(seq_along(bigX), function(y) combinations(x = bigX, k = y)),
               powSetAlgos = lapply(seq_along(bigX), comboGeneral, v = bigX),
               unit = "relative")
Unit: relative
          expr       min        lq      mean    median       uq      max neval
     powSetRje 52.992681 15.055038 11.091203 13.586952 8.860661 7.347368   100
    powSetRich 58.679666 14.864760 10.914700 13.198179 8.675812 6.017437   100
 powSetArrange  1.042766  1.062227  1.071404  1.098491 1.126971 1.044827   100
   powSetAlgos  1.000000  1.000000  1.000000  1.000000 1.000000 1.000000   100
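The figure of 2^20 - 1 results quoted in the comment on bigX follows from summing the number of combinations of each size from 1 to 20. A quick base-R check of that arithmetic:

```r
n <- 20
per_size <- choose(n, seq_len(n))  # number of k-element combinations, k = 1..n
total <- sum(per_size)
total == 2^n - 1                   # TRUE
total                              # 1048575
```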
Furthermore, arrangements comes equipped with an argument called type, which lets the user choose a particular format for the output. One of the options is type = "l" for list. It is similar to setting simplify = FALSE in combn and allows us to obtain output like that of powerSet. Observe:
do.call(c, lapply(seq_along(x), combinations, n = length(x), x = x, type = "l"))
[[1]]
[1] "red"
[[2]]
[1] "blue"
[[3]]
[1] "black"
[[4]]
[1] "red" "blue"
[[5]]
[1] "red" "black"
[[6]]
[1] "blue" "black"
[[7]]
[1] "red" "blue" "black"
And the benchmarks:
microbenchmark(powSetRje = rje::powerSet(bigX)[-1],
               powSetRich = do.call(c, lapply(seq_along(bigX), combn, x = bigX, simplify = FALSE)),
               powSetArrange = do.call(c, lapply(seq_along(bigX), combinations, n = length(bigX), x = bigX, type = "l")),
               times = 15, unit = "relative")
Unit: relative
          expr      min       lq     mean   median       uq      max neval
     powSetRje 4.925559 4.433365 4.013872 3.893674 3.819344 3.609616    15
    powSetRich 5.732216 4.975508 4.542482 4.564668 4.288592 4.003765    15
 powSetArrange 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000    15
Source: https://stackoverflow.com/questions/27953588/unordered-combinations-in-r