Can the value.var in dcast be a list or have multiple value variables?

醉酒成梦 2020-12-01 06:16

In the help files for dcast.data.table, there is a note stating that a new feature has been implemented: "dcast.data.table allows value.var column to be of typ

3 answers
  •  盖世英雄少女心
    2020-12-01 06:46

    Update

    Apparently, the fix was much easier...


    Technically, your statement that "apparently there is no such feature" isn't quite correct. There is such a feature in the recast function (which hides the melting and casting steps), but it seems Hadley may not have finished the function: rather than a reshaped data frame, it returns a list of the relevant parts of your operation.

    Here's a minimal example...

    Some sample data:

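    # The values below were generated with R < 3.6; on newer versions, call
    # RNGkind(sample.kind = "Rounding") first to reproduce them.
    set.seed(1)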
    mydf <- data.frame(x1 = rep(1:3, each = 3),
                       x2 = rep(1:3, 3),
                       salt = sample(10, 9, TRUE),
                       sugar = sample(7, 9, TRUE))
    
    mydf
    #   x1 x2 salt sugar
    # 1  1  1    3     1
    # 2  1  2    4     2
    # 3  1  3    6     2
    # 4  2  1   10     5
    # 5  2  2    3     3
    # 6  2  3    9     6
    # 7  3  1   10     4
    # 8  3  2    7     6
    # 9  3  3    7     7
    

    The effect you seem to be trying to achieve:

    reshape(mydf, idvar='x1', timevar='x2', direction='wide')
    #   x1 salt.1 sugar.1 salt.2 sugar.2 salt.3 sugar.3
    # 1  1      3       1      4       2      6       2
    # 4  2     10       5      3       3      9       6
    # 7  3     10       4      7       6      7       7
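For comparison, the feature the question asks about can be sketched directly with data.table. This assumes data.table 1.9.6 or later, where dcast accepts more than one value.var; the data frame is typed in literally (same values as mydf above) so the result does not depend on the random number generator, and the names dt and wide_dt are only illustrative.

```r
library(data.table)

# Same values as the printed mydf above, entered by hand.
mydf <- data.frame(x1 = rep(1:3, each = 3),
                   x2 = rep(1:3, 3),
                   salt = c(3, 4, 6, 10, 3, 9, 10, 7, 7),
                   sugar = c(1, 2, 2, 5, 3, 6, 4, 6, 7))

dt <- as.data.table(mydf)

# Passing several columns to value.var spreads each of them across x2.
wide_dt <- dcast(dt, x1 ~ x2, value.var = c("salt", "sugar"))
wide_dt
```

The columns come out grouped by value variable (all salt columns, then all sugar columns) rather than interleaved as in the reshape output.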
    

    recast in action. (Note that the values are all what we would expect, arranged in the dimensions we would expect.)

    library(reshape2)
    out <- recast(mydf, x1 ~ x2 + variable, measure.var = c("salt", "sugar"))
    ### recast(mydf, x1 ~ x2 + variable, id.var = c("x1", "x2"))
    out
    # $data
    #      [,1] [,2] [,3] [,4] [,5] [,6]
    # [1,]    3    1    4    2    6    2
    # [2,]   10    5    3    3    9    6
    # [3,]   10    4    7    6    7    7
    # 
    # $labels
    # $labels[[1]]
    #   x1
    # 1  1
    # 2  2
    # 3  3
    # 
    # $labels[[2]]
    #   x2 variable
    # 1  1     salt
    # 2  1    sugar
    # 3  2     salt
    # 4  2    sugar
    # 5  3     salt
    # 6  3    sugar
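The melting and casting that recast hides can also be written out as two explicit steps. The following is a minimal sketch, assuming reshape2 is installed; mydf is typed in literally with the values printed earlier, and molten and wide are just illustrative names.

```r
library(reshape2)

# Same values as the printed mydf above, entered by hand.
mydf <- data.frame(x1 = rep(1:3, each = 3),
                   x2 = rep(1:3, 3),
                   salt = c(3, 4, 6, 10, 3, 9, 10, 7, 7),
                   sugar = c(1, 2, 2, 5, 3, 6, 4, 6, 7))

# Step 1: melt to long form, one row per (x1, x2, variable) combination.
molten <- melt(mydf, id.vars = c("x1", "x2"),
               measure.vars = c("salt", "sugar"))

# Step 2: cast back to wide form, spreading x2 and variable across columns.
wide <- dcast(molten, x1 ~ x2 + variable, value.var = "value")
wide
```

Here dcast returns an ordinary data frame whose value columns are named from x2 and variable in formula order.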
    

    I'm honestly not sure if this was an incomplete function, or if it is a helper function to another function.

    All of the information is there to be able to put the data back together again, making it easy to write a function like this:

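    recast2 <- function(...) {
      # recast() returns list(data = <matrix>, labels = list(<row ids>, <column ids>))
      inList <- recast(...)
      # Bind the row identifiers to the values, then name the value columns by
      # pasting the column labels in reverse order (variable first, then x2).
      setNames(cbind(inList[[2]][[1]], inList[[1]]),
               c(names(inList[[2]][[1]]),
                 do.call(paste, c(rev(inList[[2]][[2]]), sep = "_"))))
    }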
    recast2(mydf, x1 ~ x2 + variable, measure.var = c("salt", "sugar"))
    #   x1 salt_1 sugar_1 salt_2 sugar_2 salt_3 sugar_3
    # 1  1      3       1      4       2      6       2
    # 2  2     10       5      3       3      9       6
    # 3  3     10       4      7       6      7       7
    

    A possible advantage of the recast2 approach is the ability to aggregate as well as reshape in the same step.
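To illustrate what aggregating while reshaping looks like, here is a small sketch that uses melt and dcast directly rather than recast2. It assumes reshape2 is installed, types mydf in literally with the values printed earlier, and sums over x2 purely as an example; agg is an illustrative name.

```r
library(reshape2)

# Same values as the printed mydf above, entered by hand.
mydf <- data.frame(x1 = rep(1:3, each = 3),
                   x2 = rep(1:3, 3),
                   salt = c(3, 4, 6, 10, 3, 9, 10, 7, 7),
                   sugar = c(1, 2, 2, 5, 3, 6, 4, 6, 7))

molten <- melt(mydf, id.vars = c("x1", "x2"),
               measure.vars = c("salt", "sugar"))

# Leaving x2 out of the formula puts three values in each cell,
# so fun.aggregate collapses them (here by summing).
agg <- dcast(molten, x1 ~ variable, value.var = "value",
             fun.aggregate = sum)
agg
```

Any function that reduces a vector to a single value (mean, max, length, and so on) could be substituted for sum.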
