Grouped correlation with dplyr (works only in the console)

长发绾君心 2020-12-20 17:25

I'm trying to use dplyr to calculate grouped correlations, but something is clearly wrong since the code below works only in the console:

1 answer
  • 2020-12-20 18:16

    What you are seeing is caused by having both plyr and dplyr loaded at the same time. Both packages define a summarize function, so the one attached last masks the other, and an unqualified call can end up in the wrong package unless you name the package explicitly. For the example data, this means:

    library(dplyr)
    set.seed(123)  # make the random example data reproducible
    xx <- data.frame(group = rep(1:4, 100), a = rnorm(400), b = rnorm(400))
    

    Using dplyr as intended:

    gp = group_by(xx, group)
    dplyr::summarize(gp, cor(a, b))
    #Source: local data frame [4 x 2]
    #
    #  group   cor(a, b)
    #1     1 -0.02073084
    #2     2  0.12803353
    #3     3  0.06236264
    #4     4 -0.06181904
    

    Or using plyr, which ignores the grouping and returns a single correlation over all rows:

    gp = group_by(xx, group)
    plyr::summarize(gp, cor(a, b))
    #   cor(a, b)
    #1 0.02739193
    

    So either avoid loading both packages or specify the package by using package::function.
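
To check which package an unqualified summarize resolves to in a given session, something like the following may help. It is only a minimal sketch: the result column name `r` and the object name `res` are illustrative choices, not part of the original question, and it assumes dplyr is installed.

```r
library(dplyr)

set.seed(123)
xx <- data.frame(group = rep(1:4, 100), a = rnorm(400), b = rnorm(400))

# Report the package whose `summarize` is found first on the search path.
# If plyr was attached after dplyr, this would report "plyr" instead.
environmentName(environment(summarize))

# Namespace-qualified calls do not depend on the order in which
# packages were attached. `r` is an illustrative column name.
res <- xx %>%
  dplyr::group_by(group) %>%
  dplyr::summarize(r = cor(a, b))

print(res)
```

Naming the summary column also gives a stable handle (`res$r`) instead of a column called `cor(a, b)`.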
