Question
I have a vector fruit with three entries: Peach, Plum, and Pear. I would like to find each unique pairing in fruit and create a new two-column data.frame (e.g. df.new below). How might I do this in R for a much larger data set? expand.grid returns both Pear-Plum and Plum-Pear, which are the same pairing for my purposes, so it does not give the result I am seeking. Any suggestions?
fruit <- c("Peach", "Plum", "Pear")
fruit1 <- c("Peach", "Peach", "Plum")
fruit2 <- c("Plum", "Pear", "Pear")
df.new <- data.frame(fruit1, fruit2)
# df.new
#   fruit1 fruit2
# 1  Peach   Plum
# 2  Peach   Pear
# 3   Plum   Pear
# attempt
fruit.y <- fruit
df.expand <- expand.grid(fruit,fruit.y)
Answer 1:
Using your initial strategy, you can still use expand.grid:
fruit_df <- expand.grid(fruit,fruit)
Then sort the entries within each row and drop the duplicate rows:
fruit_df2 <- as.data.frame(unique(t(apply(fruit_df, 1, function(x) sort(x)))))
V1 V2
1 Peach Peach
2 Peach Plum
3 Peach Pear
4 Plum Plum
5 Pear Plum
6 Pear Pear
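For a larger vector, the row-wise sort above may become slow. One possible alternative, sketched here under the assumption that filtering on index positions is acceptable, is to expand the positions rather than the values and keep only the rows where the first index is smaller than the second. The names idx, i, j, and df.idx are illustrative and do not come from the original answer.

```r
fruit <- c("Peach", "Plum", "Pear")

# Expand the positions 1..n instead of the values themselves.
idx <- expand.grid(i = seq_along(fruit), j = seq_along(fruit))

# Keep each unordered pair once; i < j also drops self-pairings.
# Use i <= j instead to keep pairs such as Peach-Peach.
idx <- idx[idx$i < idx$j, ]

# Map the positions back to the fruit names.
df.idx <- data.frame(fruit1 = fruit[idx$i], fruit2 = fruit[idx$j],
                     stringsAsFactors = FALSE)
```

With the three-element fruit vector this yields the same three rows as df.new in the question.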
Another strategy is to generate all pairwise combinations of fruit with combn:
combn(fruit,2)
[,1] [,2] [,3]
[1,] "Peach" "Peach" "Plum"
[2,] "Plum" "Pear" "Pear"
To get the output as a data frame, transpose the result and convert it:
as.data.frame(t(combn(fruit,2)))
Note that combn, unlike the expand.grid approach above, does not return self-pairings such as Plum-Plum.
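To reproduce the two-column layout of df.new from the question, the combn result can be unpacked row by row. This is a minimal sketch that assumes the column names fruit1 and fruit2 from the question; df.pairs is an illustrative name, and the call to unique() is added on the assumption that a larger input might contain repeated entries.

```r
fruit <- c("Peach", "Plum", "Pear")

# combn() returns a 2 x choose(n, 2) matrix; each column holds one pair.
# unique() guards against repeated entries in a larger input vector.
pairs <- combn(unique(fruit), 2)

# Build the data frame from the two matrix rows so the columns are named.
df.pairs <- data.frame(fruit1 = pairs[1, ], fruit2 = pairs[2, ],
                       stringsAsFactors = FALSE)
```

The number of rows equals choose(n, 2) for n distinct entries, so the result grows quadratically with the length of the vector.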
Source: https://stackoverflow.com/questions/23024059/find-unique-pairings-of-entries-in-a-character-vector