问题
I have a long data set I would like to make wide and I\'m curious if there is a way to do this all in one step using the reshape2 or tidyr packages in R.
The data frame df
looks like this:
id type transactions amount
20 income 20 100
20 expense 25 95
30 income 50 300
30 expense 45 250
I\'d like to get to this:
id income_transactions expense_transactions income_amount expense_amount
20 20 25 100 95
30 50 45 300 250
I know I can get part of the way there with reshape2 via for example:
dcast(df, id ~ type, value.var=\"transactions\")
But is there a way to reshape the entire df in one shot addressing both the \"transactions\" and \"amount\" variables at once? And ideally with new more appropriate column names?
回答1:
In "reshape2", you can use recast
(though in my experience, this isn't a widely known function).
library(reshape2)
recast(mydf, id ~ variable + type, id.var = c("id", "type"))
# id transactions_expense transactions_income amount_expense amount_income
# 1 20 25 20 95 100
# 2 30 45 50 250 300
You can also use base R's reshape
:
reshape(mydf, direction = "wide", idvar = "id", timevar = "type")
# id transactions.income amount.income transactions.expense amount.expense
# 1 20 20 100 25 95
# 3 30 50 300 45 250
Or, you can melt
and dcast
, like this (here with "data.table"):
library(data.table)
library(reshape2)
dcast.data.table(melt(as.data.table(mydf), id.vars = c("id", "type")),
id ~ variable + type, value.var = "value")
# id transactions_expense transactions_income amount_expense amount_income
# 1: 20 25 20 95 100
# 2: 30 45 50 250 300
In later versions of dcast.data.table
from "data.table" (1.9.8) you will be able to do this directly. If I understand correctly, what @Arun is trying to implement would be doing the reshaping without first having to melt
the data, which is what happens presently with recast
, which is essentially a wrapper for a melt
+ dcast
sequence of operations.
And, for thoroughness, here's the tidyr
approach:
library(dplyr)
library(tidyr)
mydf %>%
gather(var, val, transactions:amount) %>%
unite(var2, type, var) %>%
spread(var2, val)
# id expense_amount expense_transactions income_amount income_transactions
# 1 20 95 25 100 20
# 2 30 250 45 300 50
回答2:
With data.table v1.9.6+, we can cast multiple value.var
columns simultaneously (and also use multiple aggregation functions in fun.aggregate
). Please see ?dcast
for more and also the examples section.
require(data.table) # v1.9.6+
dcast(dt, id ~ type, value.var=names(dt)[3:4])
# id transactions_expense transactions_income amount_expense amount_income
# 1: 20 25 20 95 100
# 2: 30 45 50 250 300
来源:https://stackoverflow.com/questions/27247078/reshape-multiple-values-at-once