Reshape Data Long to Wide - understanding reshape parameters

一曲冷凌霜 submitted on 2019-11-27 15:48:09

You can use the dcast function from the reshape2 package, which is easier to understand than base reshape. In the formula, the variables on the left side stay long (one row per combination), while the variables on the right side go wide (one column per combination).

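fun.aggregate is the function applied when there is more than one value per case, that is, per combination of the formula variables. If you're sure you don't have repeated cases, you can use mean or sum, since either one returns a single value unchanged. Note that with sum, a combination with no data comes out as 0 rather than NA.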

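library(reshape2)
# id columns on the left stay long; each month/year/trainingtype combination becomes a column
dcast(data, formula = dogid + home + school ~ month + year + trainingtype,
      value.var = 'timeincomp',
      fun.aggregate = sum)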

I hope it works:

  dogid home school 1_2014_1 2_2014_1 12_2015_2
1 12345    1      1      340      360         0
2 31323    7      3      500      520       440
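
To see what fun.aggregate does when a case repeats, here is a minimal sketch. The data frame is an assumption, reconstructed from the output above, and the duplicate row (a second record for dog 12345 with a timeincomp of 10) is purely illustrative.

```r
library(reshape2)

# Hypothetical sample data, reconstructed from the output shown above.
data <- data.frame(
  dogid        = c(12345, 12345, 31323, 31323, 31323),
  home         = c(1, 1, 7, 7, 7),
  school       = c(1, 1, 3, 3, 3),
  month        = c(1, 2, 1, 2, 12),
  year         = c(2014, 2014, 2014, 2014, 2015),
  trainingtype = c(1, 1, 1, 1, 2),
  timeincomp   = c(340, 360, 500, 520, 440)
)

# Illustrative duplicate: a second record for dog 12345, month 1, 2014, type 1.
dup <- data.frame(dogid = 12345, home = 1, school = 1,
                  month = 1, year = 2014, trainingtype = 1, timeincomp = 10)
data_dup <- rbind(data, dup)

wide_dup <- dcast(data_dup,
                  dogid + home + school ~ month + year + trainingtype,
                  value.var = "timeincomp",
                  fun.aggregate = sum)

# The two values that fall into the same cell are summed (340 + 10).
wide_dup[wide_dup$dogid == 12345, "1_2014_1"]
```

With fun.aggregate = mean, the same cell would hold the average of the two values instead. If fun.aggregate is left out and duplicates exist, dcast falls back to counting the values with length and prints a message saying so.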

In this case, using base reshape, you essentially want an interaction() of the three time variables to define your wide variables, so:

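idvars  <- c("dogid","home","school")        # columns that identify a row in the wide result
grpvars <- c("year","month","trainingtype")  # columns combined to name the wide columns
outvar  <- "timeincomp"                      # value column to spread out
time    <- interaction(dat[grpvars])         # factor with labels such as 2014.1.1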

reshape(
  cbind(dat[c(idvars,outvar)],time),
  idvar=idvars,
  timevar="time",
  direction="wide"
)

#  dogid home school timeincomp.2014.1.1 timeincomp.2014.2.1 timeincomp.2015.12.2
#1 12345    1      1                 340                 360                   NA
#3 31323    7      3                 500                 520                  440
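
To make the role of interaction() more concrete, here is a small self-contained sketch in base R. The data frame is an assumption, reconstructed from the output above. It shows the label each row receives before reshape turns those labels into column suffixes.

```r
# Hypothetical sample data, reconstructed from the output shown above.
dat <- data.frame(
  dogid        = c(12345, 12345, 31323, 31323, 31323),
  home         = c(1, 1, 7, 7, 7),
  school       = c(1, 1, 3, 3, 3),
  year         = c(2014, 2014, 2014, 2014, 2015),
  month        = c(1, 2, 1, 2, 12),
  trainingtype = c(1, 1, 1, 1, 2),
  timeincomp   = c(340, 360, 500, 520, 440)
)

grpvars <- c("year", "month", "trainingtype")
time    <- interaction(dat[grpvars])

# Each row gets one label built from its year, month and trainingtype.
as.character(time)

wide <- reshape(
  cbind(dat[c("dogid", "home", "school", "timeincomp")], time),
  idvar     = c("dogid", "home", "school"),
  timevar   = "time",
  direction = "wide"
)

# The value column name is prefixed to each label to form the wide column names.
names(wide)
```

Because reshape does not aggregate, a dog with no record for a given label simply gets NA in that column.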

You can do the same thing using the new replacement for reshape2, tidyr:

library(tidyr)
library(dplyr)
data %>% unite(newcol, c(year, month, trainingtype)) %>%
         spread(newcol, timeincomp)

  dogid home school 2014_1_1 2014_2_1 2015_12_2
1 12345    1      1      340      360        NA
2 31323    7      3      500      520       440

First, we unite the year, month and trainingtype columns into a new column called newcol, then we spread the data with timeincomp as our value variable.

The NA appears because that dog has no value for that combination. You can replace it by setting the fill argument of spread(), which defaults to fill = NA.
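
As a minimal sketch of the fill argument, the following uses a data frame that is an assumption, reconstructed from the output above, and fills the missing combination with 0 (the choice of 0 is only for illustration).

```r
library(tidyr)

# Hypothetical sample data, reconstructed from the output shown above.
data <- data.frame(
  dogid        = c(12345, 12345, 31323, 31323, 31323),
  home         = c(1, 1, 7, 7, 7),
  school       = c(1, 1, 3, 3, 3),
  month        = c(1, 2, 1, 2, 12),
  year         = c(2014, 2014, 2014, 2014, 2015),
  trainingtype = c(1, 1, 1, 1, 2),
  timeincomp   = c(340, 360, 500, 520, 440)
)

wide0 <- data %>%
  unite(newcol, year, month, trainingtype) %>%  # e.g. "2014_1_1"
  spread(newcol, timeincomp, fill = 0)          # missing combinations become 0

# Dog 12345 has no 2015_12_2 record, so this column holds the fill value for it.
wide0[["2015_12_2"]]
```

In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), where the corresponding argument is values_fill.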
