Correlation between groups in R data.table

南笙 2021-01-02 03:59

Is there a way of elegantly calculating the correlations between values if those values are stored by group in a single column of a data.table (other than converting the dat

3 answers
  •  盖世英雄少女心
    2021-01-02 04:16

    data.table has no dedicated shortcut for this. The first approach you've provided is probably the simplest (note that subsetting with dt["a"] relies on dt being keyed by group):

    cor(dt["a"]$value, dt["b"]$value)
    
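    For reference, here is a minimal, self-contained sketch of that direct approach. The construction of dt is an assumption (the question's data is not shown in full): four ids, two groups, and a seed guessed so that the values line up with the table printed further down. It uses on = "group" rather than a key, so it runs without a prior setkey().

```r
library(data.table)

# Assumed reconstruction of the example data: 4 ids, 2 groups, long format.
set.seed(1)
dt <- data.table(
  id    = rep(1:4, times = 2),
  group = rep(c("a", "b"), each = 4),
  value = rnorm(8)
)

# Ad-hoc join on `group`, so no key has to be set beforehand.
a <- dt["a", on = "group"]$value
b <- dt["b", on = "group"]$value

# Both vectors are in id order here, so elements pair up by id.
r <- cor(a, b)
print(r)
```

    This only gives a meaningful result when both groups contain the same ids in the same order. If that is not guaranteed, the wide-format approach is safer because it aligns values by id.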

    An alternative is to reshape your data.table from "long" format to "wide" format, so that each group becomes its own column:

    > dtw <- reshape(dt, timevar="group", idvar="id", direction="wide")
    > dtw
       id    value.a    value.b
    1:  1 -0.6264538  0.3295078
    2:  2  0.1836433 -0.8204684
    3:  3 -0.8356286  0.4874291
    4:  4  1.5952808  0.7383247
    > cor(dtw[,list(value.a, value.b)])
              value.a   value.b
    value.a 1.0000000 0.1556371
    value.b 0.1556371 1.0000000
    

    Update: If you're using data.table version >= 1.9.0, you can use dcast.data.table instead of reshape, which will be much faster. Check this post for more info.

    dcast.data.table(dt, id ~ group, value.var = "value")
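
    Putting the cast and the correlation together, a rough end-to-end sketch might look like the following. It assumes a recent data.table in which the dcast() generic is exported by the package itself (so dcast.data.table does not need to be called by name), and the three-group dt is hypothetical data made up for illustration.

```r
library(data.table)

# Hypothetical long-format data: 4 ids measured in 3 groups.
set.seed(1)
dt <- data.table(
  id    = rep(1:4, times = 3),
  group = rep(c("a", "b", "c"), each = 4),
  value = rnorm(12)
)

# Long to wide: one row per id, one column per group.
dtw <- dcast(dt, id ~ group, value.var = "value")

# Keep only the group columns and correlate them pairwise.
group_cols <- setdiff(names(dtw), "id")
cm <- cor(dtw[, group_cols, with = FALSE])
print(cm)
```

    Because the cast aligns rows by id, this yields the full group-by-group correlation matrix in one call, however many groups there are.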
    
