Using multiple variables in plyr

Submitted by 坚强是说给别人听的谎言 on 2019-12-24 03:53:07

Question


I am trying to use plyr but am having difficulty summarising over several variables. Here is an example.

df <- read.table(header=TRUE, text="
Firm Foreign SME Turnover
A1       N   Y      200
A2       N   N     1000
A3       Y   Y      100
A1       N   N      500
A2       Y   Y      200
A3       Y   Y     1000
A1       Y   N      200
A2       N   N     1000
A2       N   Y      100
A2       N   Y      200")

I am trying to create a table that summarizes Turnover by the two variables, essentially combining the following two pieces of code

library(plyr)

# Turnover summed per Firm and Foreign flag
t1 <- ddply(df, c('Firm', 'Foreign'), summarise,
            BudgetForeign = sum(Turnover, na.rm = TRUE))

# Turnover summed per Firm and SME flag
t2 <- ddply(df, c('Firm', 'SME'), summarise,
            BudgetSME = sum(Turnover, na.rm = TRUE))

to get the following result:

res <- read.table(header=TRUE, text="
Firm          A1   A2   A3  
BudgetForeign 200  200 1100
BudgetSME     200  500 1100")
res

How can I achieve this without running several separate operations and then subsetting and combining the results afterwards?

Thanks in advance.


Answer 1:


I think you only want the values where Foreign or SME is 'Y'. If that's the case, I would use melt and dcast from the reshape2 package rather than plyr.

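library(reshape2)

# Stack Foreign and SME into long format, keeping Firm and Turnover as identifiers
df.m <- melt(df, id.vars=c('Firm', 'Turnover'))

# Keep only the 'Y' rows, then sum Turnover per flag and firm
dcast(df.m[df.m$value=='Y',], variable ~ Firm, value.var='Turnover', fun.aggregate=sum)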

  variable  A1  A2   A3
1  Foreign 200 200 1100
2      SME 200 500 1100
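
For readers unfamiliar with melt, a minimal sketch of the intermediate long-format table may help explain why filtering on value == 'Y' works. It rebuilds the question's df so that it runs on its own, and it assumes reshape2 is installed. The name df.y is only illustrative and does not appear in the original answer.

```r
library(reshape2)

# Same data as in the question
df <- read.table(header = TRUE, text = "
Firm Foreign SME Turnover
A1 N Y 200
A2 N N 1000
A3 Y Y 100
A1 N N 500
A2 Y Y 200
A3 Y Y 1000
A1 Y N 200
A2 N N 1000
A2 N Y 100
A2 N Y 200")

# melt stacks the Foreign and SME columns: each original row appears twice,
# once per flag, with the flag name in 'variable' and its Y/N entry in 'value'.
df.m <- melt(df, id.vars = c('Firm', 'Turnover'))
str(df.m)

# Keeping only the 'Y' rows leaves the turnover that counts toward each flag
# (df.y is an illustrative name).
df.y <- df.m[df.m$value == 'Y', ]
table(df.y$variable)
```

Because every original row is duplicated once per flag column, summing Turnover after the filter counts a row toward both Foreign and SME when both flags are 'Y', which is what the requested table needs.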

If you also want to see the difference between Y and N, you can add value to the formula in dcast:

dcast(df.m, variable + value ~ Firm, value.var='Turnover', fun.aggregate=sum)

  variable value  A1   A2   A3
1  Foreign     N 700 2300    0
2  Foreign     Y 200  200 1100
3      SME     N 700 2000    0
4      SME     Y 200  500 1100



Answer 2:


Thanks, Justin. Based on your answer, the following code should solve my problem.

library(reshape2)

# Long format: one row per original row and flag column
df.m <- melt(df, id.vars=c('Firm', 'Turnover'))

# Turnover summed per flag, Y/N value and firm
x <- dcast(df.m, variable + value ~ Firm, value.var='Turnover', fun.aggregate=sum)

# Keep only the 'Y' rows for each flag
res <- rbind(
         BudgetForeign = subset(x, variable == 'Foreign' & value == 'Y'),
         BudgetSME = subset(x, variable == 'SME' & value == 'Y')
         )
res
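
Since the question asked for a plyr solution, here is a hedged sketch of how the same table might be produced with a single ddply call. It is not taken from either answer; it assumes that only the 'Y' rows should be summed, as Answer 1 guessed, and by_firm is an illustrative name.

```r
library(plyr)

# Same data as in the question
df <- read.table(header = TRUE, text = "
Firm Foreign SME Turnover
A1 N Y 200
A2 N N 1000
A3 Y Y 100
A1 N N 500
A2 Y Y 200
A3 Y Y 1000
A1 Y N 200
A2 N N 1000
A2 N Y 100
A2 N Y 200")

# Group by Firm only; each summary column sums Turnover over the rows
# where the corresponding flag is 'Y'.
by_firm <- ddply(df, 'Firm', summarise,
                 BudgetForeign = sum(Turnover[Foreign == 'Y'], na.rm = TRUE),
                 BudgetSME     = sum(Turnover[SME == 'Y'], na.rm = TRUE))

# Transpose so that firms become columns, matching the layout in the question.
res <- t(as.matrix(by_firm[-1]))
colnames(res) <- as.character(by_firm$Firm)
res
```

The conditional subsetting inside summarise avoids the separate melt and subset steps, at the cost of writing one expression per flag column, so the reshape2 approach probably scales better when there are many flags.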


Source: https://stackoverflow.com/questions/11990830/using-multiple-variables-in-plyr
