How do I select all unique combinations of two columns in an R data frame?

一个人的身影 2020-12-18 15:48

I have a correlation matrix that I put into a data frame like so:

row | var1 | var2 | cor
1   | A    | B    | 0.6
2   | B    | A    | 0.6
3   | A    | C    | 0

5 answers
  •  执笔经年
    2020-12-18 16:26

    One solution is to put var1 and var2 in alphabetical order within each row, so that (B, A) becomes (A, B), and then call unique() to drop the repeated rows. I used data.table for convenience, but the same approach works with dplyr.

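    library(data.table)

    dt = data.table(var1 = c("A", "B", "A", "C"), var2 = c("B", "A", "C", "A"), cor = c(0.6, 0.6, 0.4, 0.4))

    # Put each pair in alphabetical order; pmin/pmax compare element-wise,
    # so no per-row grouping is needed
    dt[, var1_alt := pmin(var1, var2)]
    dt[, var2_alt := pmax(var1, var2)]

    # Mirrored rows are now identical, so unique() keeps one copy of each
    dt = unique(dt[, .(var1 = var1_alt, var2 = var2_alt, cor)])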
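
For comparison, the same idea can be sketched in base R without any packages. This is only an illustration under the assumption that var1 and var2 are character columns; the names `df`, `first`, `second`, `keep`, and `deduped` are made up for the example, and the data is copied from the answer above.

```r
# Example data copied from the answer above
df <- data.frame(
  var1 = c("A", "B", "A", "C"),
  var2 = c("B", "A", "C", "A"),
  cor  = c(0.6, 0.6, 0.4, 0.4),
  stringsAsFactors = FALSE
)

# Order each pair alphabetically; pmin/pmax work element-wise on character vectors
first  <- pmin(df$var1, df$var2)
second <- pmax(df$var1, df$var2)

# Keep only the first row seen for each unordered pair
keep <- !duplicated(data.frame(first, second))
deduped <- data.frame(
  var1 = first[keep],
  var2 = second[keep],
  cor  = df$cor[keep],
  stringsAsFactors = FALSE
)
print(deduped)
```

One difference from the data.table version: this sketch deduplicates on the pair alone and keeps the cor value of the first row it meets, whereas unique() on all three columns would keep two rows for a pair whose mirrored entries carried different cor values. For a symmetric correlation matrix the two should agree.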
    
