Suppose (to simplify) I have a table containing some control vs. treatment data:
Which, Color, Response, Count
Control, Red, 2, 10
Control, Blue, 3, 20
Treatment
To add to the options (many years later)...
The typical approach in base R would involve the reshape() function, which is generally unpopular because its many arguments take time to master. It's a pretty efficient function for smaller datasets, but doesn't always scale well.
reshape(mydf, direction = "wide", idvar = "Color", timevar = "Which")
# Color Response.Control Count.Control Response.Treatment Count.Treatment
# 1 Red 2 10 1 14
# 2 Blue 3 20 4 21
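To run the call above, mydf has to exist. Here is a minimal sketch that rebuilds it from the values printed in the outputs. The two Treatment rows are an assumption inferred from those outputs, since the table in the question is cut off:

```r
# Assumed reconstruction of mydf; the Treatment rows are inferred from the printed results
mydf <- data.frame(
  Which    = c("Control", "Control", "Treatment", "Treatment"),
  Color    = c("Red", "Blue", "Red", "Blue"),
  Response = c(2, 3, 1, 4),
  Count    = c(10, 20, 14, 21),
  stringsAsFactors = FALSE
)

# Same base R call as above, stored so the result can be inspected
wide <- reshape(mydf, direction = "wide", idvar = "Color", timevar = "Which")
names(wide)
```

The idvar column identifies each output row and the timevar values become the suffixes on the remaining columns.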
Already covered are cast and dcast from the "reshape" and "reshape2" packages (and now dcast.data.table from "data.table", which is especially useful when you have large datasets). From the Hadleyverse there's also "tidyr", which works nicely with the "dplyr" package:
library(tidyr)
library(dplyr)
mydf %>%
  gather(var, val, Response:Count) %>%  ## make a long dataframe
  unite(RN, var, Which) %>%             ## combine the var and Which columns
  spread(RN, val)                       ## make the results wide
# Color Count_Control Count_Treatment Response_Control Response_Treatment
# 1 Blue 20 21 3 4
# 2 Red 10 14 2 1
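In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a sketch of what should be the one-step equivalent. It assumes a recent tidyr and the same reconstructed mydf, whose Treatment rows are inferred from the outputs shown rather than taken from the question:

```r
library(tidyr)

# Assumed reconstruction of mydf (Treatment rows inferred from the printed results)
mydf <- data.frame(
  Which    = c("Control", "Control", "Treatment", "Treatment"),
  Color    = c("Red", "Blue", "Red", "Blue"),
  Response = c(2, 3, 1, 4),
  Count    = c(10, 20, 14, 21),
  stringsAsFactors = FALSE
)

# names_from supplies the column suffixes; values_from lists the columns to widen.
# With several value columns, new names are built as <value column>_<Which>.
wide_tidy <- pivot_wider(mydf, names_from = Which, values_from = c(Response, Count))
wide_tidy
```

This skips the gather/unite steps because pivot_wider() accepts more than one value column directly.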
Also worth noting: in a forthcoming version of "data.table", the dcast.data.table function should be able to handle this without having to melt your data first.
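For comparison, the melt-first route that this refers to might look like the sketch below. It assumes a current "data.table" (where melt and dcast have data.table methods) and the same reconstructed mydf, with the Treatment rows inferred from the outputs shown:

```r
library(data.table)

# Assumed reconstruction of mydf (Treatment rows inferred from the printed results)
mydf <- data.frame(
  Which    = c("Control", "Control", "Treatment", "Treatment"),
  Color    = c("Red", "Blue", "Red", "Blue"),
  Response = c(2, 3, 1, 4),
  Count    = c(10, 20, 14, 21),
  stringsAsFactors = FALSE
)

# Step 1: melt to long form, one row per (Which, Color, variable)
long <- melt(as.data.table(mydf),
             id.vars = c("Which", "Color"),
             measure.vars = c("Response", "Count"))

# Step 2: cast back to wide; variable and Which are combined into the column names
wide_dt <- dcast(long, Color ~ variable + Which, value.var = "value")
wide_dt
```

The intermediate long table is the extra step that casting several value columns at once is meant to remove.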
The data.table implementation of dcast allows you to convert multiple columns to a wide format without melting the data first, as follows:
library(data.table)
dcast(as.data.table(mydf), Color ~ Which, value.var = c("Response", "Count"))
# Color Response_Control Response_Treatment Count_Control Count_Treatment
# 1: Blue 3 4 20 21
# 2: Red 2 1 10 14