Question
I have the following data frame:
# my_data
id cg
1 a
2 b
3 a
3 b
4 b
4 c
5 b
5 c
5 d
6 d
I would like to compute the covariance of the values of cg. I believe I can obtain it by using cov() on the following matrix, where every cell counts the number of co-occurrences between two values of cg.
# my_matrix
cg a b c d
a 2 1 0 0
b 1 4 2 1
c 0 2 2 1
d 0 1 1 2
What is the quickest way to go from my_data to my_matrix? Please be aware that cg contains more than 700 unique values.
If there is a better way to generate the covariance matrix, I am also interested in that.
Here is the code to generate my_data:
my_data <- structure(list(id = c(1L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 6L),
cg = c("a", "b", "a", "b", "b", "c", "b", "c", "d", "d")),
.Names = c("id", "cg"),
class = "data.frame", row.names = c(NA, -10L))
Answer 1:
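We can use crossprod with table. table(my_data) builds the id-by-cg contingency table X, and crossprod(X) computes t(X) %*% X, so each cell counts the ids in which both values of cg occur: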
crossprod(table(my_data))
# cg
#cg a b c d
# a 2 1 0 0
# b 1 4 2 1
# c 0 2 2 1
# d 0 1 1 2
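To see why this works, the one-liner can be unpacked into its two steps. The sketch below is only an illustration on the question's sample data; the names incidence, co_occurrence and by_hand are mine, not from the original post. It assumes each (id, cg) pair appears at most once; if pairs can repeat, running unique(my_data) first keeps the cells as plain co-occurrence counts. The last two lines are an assumption about what "covariance" is meant: the question proposes cov() on the co-occurrence matrix, while cov() on the incidence matrix treats each id as one observation, and the two generally give different results.

```r
# Sample data from the question.
my_data <- data.frame(
  id = c(1L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 6L),
  cg = c("a", "b", "a", "b", "b", "c", "b", "c", "d", "d"),
  stringsAsFactors = FALSE
)

# Step 1: id-by-cg incidence matrix (rows = id, columns = cg).
# unclass() drops the "table" class and leaves a plain integer matrix.
incidence <- unclass(table(my_data))

# Step 2: crossprod(X) is t(X) %*% X, so cell [i, j] sums, over all ids,
# the product of the indicators for cg value i and cg value j.
co_occurrence <- crossprod(incidence)
by_hand <- t(incidence) %*% incidence

# Both routes give the same counts.
stopifnot(all(co_occurrence == by_hand))

# Two possible readings of "covariance" (assumption, see the text above):
cov_of_counts    <- cov(co_occurrence)  # what the question proposes
cov_of_incidence <- cov(incidence)      # each id treated as one observation
```

With more than 700 unique values of cg the resulting matrix is roughly 700 x 700, which is still small enough to hold as a dense matrix.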
Source: https://stackoverflow.com/questions/43631179/compute-covariance-matrix-from-list-of-occurrences