Reshape data from long to wide, with time in new wide variable name

时光说笑 2020-12-15 02:04

I have a data frame that I would like to merge from long to wide format, but I would like to have the time embedded into the variable name in the wide format. Here is an ex

4 Answers
  • 2020-12-15 02:17

    I had to do it in two reshape steps. The column names may not be exactly what you need, but they can be renamed easily.

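    # Example data in long format: one id, four variables, four time points
    id <- as.numeric(rep(1, 16))
    time <- rep(c(5,10,15,20), 4)
    varname <- c(rep("var1",4), rep("var2", 4), rep("var3", 4), rep("var4", 4))
    value <- rnorm(16)
    # Note: cbind() builds a character matrix here, so every column is stored as text, not numeric
    tmpdata <- as.data.frame(cbind(id, time, varname, value))
    
    # Step 1: spread time into columns (value.5, value.10, ...)
    first <- reshape(tmpdata, timevar="time", idvar=c("id", "varname"), direction="wide")
    # Step 2: spread varname into columns (value.5.var1, value.10.var1, ...)
    second <- reshape(first, timevar="varname", idvar="id", direction="wide")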
    

    And the output:

    > tmpdata
       id time varname               value
    1   1    5    var1  -0.231227494628982
    2   1   10    var1   -1.80887236653438
    3   1   15    var1  -0.443229294431553
    4   1   20    var1    1.33719337048763
    5   1    5    var2   0.673109282347586
    6   1   10    var2   -0.42142267953938
    7   1   15    var2   0.874367622725874
    8   1   20    var2   -1.19917678039462
    9   1    5    var3    1.13495606258399
    10  1   10    var3 -0.0779385346672042
    11  1   15    var3  -0.126775240288037
    12  1   20    var3  -0.760739300144526
    13  1    5    var4   -1.94626587907069
    14  1   10    var4    1.25643195699455
    15  1   15    var4   -0.50986941213717
    16  1   20    var4   -1.01324846239812
    > first
       id varname            value.5            value.10           value.15
    1   1    var1 -0.231227494628982   -1.80887236653438 -0.443229294431553
    5   1    var2  0.673109282347586   -0.42142267953938  0.874367622725874
    9   1    var3   1.13495606258399 -0.0779385346672042 -0.126775240288037
    13  1    var4  -1.94626587907069    1.25643195699455  -0.50986941213717
                 value.20
    1    1.33719337048763
    5   -1.19917678039462
    9  -0.760739300144526
    13  -1.01324846239812
    > second
      id       value.5.var1     value.10.var1      value.15.var1    value.20.var1
    1  1 -0.231227494628982 -1.80887236653438 -0.443229294431553 1.33719337048763
           value.5.var2     value.10.var2     value.15.var2     value.20.var2
    1 0.673109282347586 -0.42142267953938 0.874367622725874 -1.19917678039462
          value.5.var3       value.10.var3      value.15.var3      value.20.var3
    1 1.13495606258399 -0.0779385346672042 -0.126775240288037 -0.760739300144526
           value.5.var4    value.10.var4     value.15.var4     value.20.var4
    1 -1.94626587907069 1.25643195699455 -0.50986941213717 -1.01324846239812
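
    As a rough sketch of the renaming step mentioned above, the names produced by the two reshapes (value.<time>.<varname>) can be rearranged with a regular expression. The target pattern <varname>.<time> and the toy values 1 to 16 are assumptions made for illustration, not something the question specifies.

    ```r
    # Toy long-format data (hypothetical values; value kept numeric)
    tmpdata <- data.frame(
      id      = 1,
      time    = rep(c(5, 10, 15, 20), 4),
      varname = rep(paste0("var", 1:4), each = 4),
      value   = seq_len(16)
    )

    # Same two reshape steps as in the answer
    first  <- reshape(tmpdata, timevar = "time", idvar = c("id", "varname"),
                      direction = "wide")
    second <- reshape(first, timevar = "varname", idvar = "id",
                      direction = "wide")

    # Turn "value.5.var1" into "var1.5"; the "id" column does not match and is left alone
    names(second) <- sub("^value\\.([0-9]+)\\.(.+)$", "\\2.\\1", names(second))
    ```

    Any other naming scheme only needs a different replacement string in sub().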
    
  • 2020-12-15 02:24

    Why not just paste varname and time together before you reshape?
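
    A minimal sketch of that idea, assuming the same column layout as the question (id, time, varname, value). The helper column key and the toy values 1 to 16 are made up for illustration.

    ```r
    # Toy long-format data (hypothetical values; value kept numeric)
    tmpdata <- data.frame(
      id      = 1,
      time    = rep(c(5, 10, 15, 20), 4),
      varname = rep(c("var1", "var2", "var3", "var4"), each = 4),
      value   = seq_len(16)
    )

    # Build one key per (varname, time) pair, e.g. "var1.5"
    tmpdata$key <- paste(tmpdata$varname, tmpdata$time, sep = ".")

    # Reshape once, keeping only the columns that are still needed
    wide <- reshape(tmpdata[, c("id", "key", "value")],
                    timevar = "key", idvar = "id", direction = "wide")

    # reshape() prefixes each new column with "value."; strip that prefix
    names(wide) <- sub("^value\\.", "", names(wide))
    ```

    This gives one row per id, with columns named var1.5, var1.10 and so on, in a single reshape call.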

  • 2020-12-15 02:29

    I gave up on the old reshape() command 2 years ago (not Hadley's). It seems figuring that damn thing out each time was actually harder than just doing it the 'hard' way, which is much more flexible.

    Your data in your example are all nicely sorted. You might have to sort your real data by var name and time first.

    (renamed your tmpdata to tmp, made value numeric)

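    tmp <- tmp[order(tmp$id, tmp$varname, tmp$time), ]    # sort so values line up across ids
    y <- lapply(split(tmp, tmp$id), function(x) x$value)  # one vector of values per id
    df <- data.frame(unique(tmp$id), do.call(rbind, y))   # one row per id
    one <- tmp[tmp$id == tmp$id[1], ]                     # rows of the first id supply the labels
    names(df) <- c('id', paste(one$varname, one$time, sep = '.'))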
    
  • 2020-12-15 02:32

    This is trivial with the reshape package:

    library(reshape)
    cast(tmpdata, ... ~ varname + time)
    