Question
Given the table below:
X =
     col1 col2 col3
row1 "A"  "A"  "1.0"
row2 "A"  "B"  "0.9"
row3 "A"  "C"  "0.4"
row4 "B"  "A"  "0.9"
row5 "B"  "B"  "1.0"
row6 "B"  "C"  "0.2"
row7 "C"  "A"  "0.4"
row8 "C"  "B"  "0.2"
row9 "C"  "C"  "1.0"
where col3 is a correlation measure between the pairs of entities in col1 and col2.
How can I construct a matrix whose column names come from col1, whose row names come from col2, and whose cells are filled with the values in col3?
Answer 1:
We need some data to work with, so I'll make some up.
# Make fake data
x <- c('A','B','C')
dat <- expand.grid(x, x)
dat$Var3 <- rnorm(9)
This can be done in base R. I'm not very experienced with the reshape function, but the following call works. The column names would need to be cleaned up afterwards, though.
> reshape(dat, idvar = "Var1", timevar = "Var2", direction = "wide")
  Var1     Var3.A      Var3.B     Var3.C
1    A -1.2442937 -0.01132871 -0.5693153
2    B -1.6044295 -1.34907504  1.6778866
3    C  0.5393472 -1.00637345 -0.7694940
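The cleanup mentioned above could look roughly like the following sketch. It applies the same kind of reshape() call to the data from the question, typed in by hand as a data frame X with a numeric col3 (an assumption about how the table is stored). Because the question asks for col1 across the columns and col2 down the rows, col2 is used as idvar here. The names wide and m are only illustrative.

```r
# Data from the question, entered by hand (col3 kept numeric)
X <- data.frame(
  col1 = rep(c("A", "B", "C"), each = 3),
  col2 = rep(c("A", "B", "C"), times = 3),
  col3 = c(1.0, 0.9, 0.4, 0.9, 1.0, 0.2, 0.4, 0.2, 1.0),
  stringsAsFactors = FALSE
)

# One row per col2 value, one column per col1 value
wide <- reshape(X, idvar = "col2", timevar = "col1", direction = "wide")

# Keep only the value columns, then move the ids into the row names
m <- as.matrix(wide[, setdiff(names(wide), "col2")])
rownames(m) <- wide$col2

# Drop the "col3." prefix that reshape() adds to the column names
colnames(m) <- sub("^col3\\.", "", colnames(m))
m
```

After this, a cell can be looked up by name, for example m["B", "A"] for the pair with col2 equal to "B" and col1 equal to "A".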
Alternatively, you could use the dcast function from the reshape2 package. I think the output is a little cleaner.
> library(reshape2)
> dcast(dat, Var1 ~ Var2, value.var = "Var3")
  Var1          A           B          C
1    A -1.2442937 -0.01132871 -0.5693153
2    B -1.6044295 -1.34907504  1.6778866
3    C  0.5393472 -1.00637345 -0.7694940
Answer 2:
df <- read.table(textConnection('col1 col2 col3
row1 "A" "A" "1.0"
row2 "A" "B" "0.9"
row3 "A" "C" "0.4"
row4 "B" "A" "0.9"
row5 "B" "B" "1.0"
row6 "B" "C" "0.2"
row7 "C" "A" "0.4"
row8 "C" "B" "0.2"
row9 "C" "C" "1.0"'), header=TRUE)
## fetch row/column indices
rows <- match(df$col1, LETTERS)
cols <- match(df$col2, LETTERS)
## create matrix
m <- matrix(0, nrow=max(rows), ncol=max(cols))
## fill matrix
m[cbind(rows, cols)] <- df$col3
m
#     [,1] [,2] [,3]
#[1,]  1.0  0.9  0.4
#[2,]  0.9  1.0  0.2
#[3,]  0.4  0.2  1.0
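The index-matrix approach above relies on the labels being capital letters, and the resulting matrix carries no row or column names. One possible generalisation, offered as a sketch rather than as part of the original answer, looks up positions in the sorted unique labels and attaches those labels as dimnames. The names row_labels and col_labels are made up for this illustration, and the data is read in again so that the block runs on its own.

```r
# Same data as in the question; the unnamed first field becomes the row names
df <- read.table(text = 'col1 col2 col3
row1 "A" "A" "1.0"
row2 "A" "B" "0.9"
row3 "A" "C" "0.4"
row4 "B" "A" "0.9"
row5 "B" "B" "1.0"
row6 "B" "C" "0.2"
row7 "C" "A" "0.4"
row8 "C" "B" "0.2"
row9 "C" "C" "1.0"', header = TRUE, stringsAsFactors = FALSE)

# Labels for each dimension: col2 down the rows, col1 across the columns
row_labels <- sort(unique(df$col2))
col_labels <- sort(unique(df$col1))

# Empty matrix; NA marks any pair that never appears in the data
m <- matrix(NA_real_, nrow = length(row_labels), ncol = length(col_labels),
            dimnames = list(row_labels, col_labels))

# A two-column index matrix fills one cell per input row
m[cbind(match(df$col2, row_labels), match(df$col1, col_labels))] <- df$col3
m
```

Starting from NA instead of 0 is a design choice: a missing pair then stays visibly missing instead of looking like a correlation of zero.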
Source: https://stackoverflow.com/questions/18560962/r-how-to-combine-2-pairwise-vectors-into-a-matrix