Sending in Column Name to ddply from Function

Posted by 人走茶凉 on 2019-11-26 23:13:03

Question


I'd like to be able to pass a column name into a call that I am making to ddply. An example ddply call:

ddply(myData, .(MyGrouping), summarise, count=sum(myColumnName))

If I have ddply wrapped within another function, is it possible to write that function so that I can pass in an arbitrary column name to use as myColumnName?


Answer 1:


There has got to be a better way. And I couldn't figure out how to make it work with summarise.

library(plyr)
# Look up the column by its name (a string) inside an anonymous function
my.fun <- function(df, count.column) {
  ddply(df, .(x), function(d) sum(d[[count.column]]))
}

dat <- data.frame(x=letters[1:2], y=1:10)

> my.fun(dat, 'y')
  x V1
1 a 25
2 b 30
> 



Answer 2:


As @David Arenburg said, this question is pretty old. Today, either the data.table or the dplyr package can give you the same result much faster.

Here is the data.table version of the answer.

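library(data.table)
# MyGrouping and myColumnName are column names passed as strings.
# Note: setDT() converts myData to a data.table by reference.
my.fun <- function(myData, MyGrouping, myColumnName) {
  setDT(myData)[, lapply(.SD, sum), by=MyGrouping, .SDcols=myColumnName]
}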
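The answer mentions dplyr but only shows data.table. A minimal sketch of a dplyr counterpart follows, assuming dplyr 1.0.0 or later (for across() and the .groups argument). The function name my.fun.dplyr is hypothetical, and the test data is the dat frame from Answer 1.

```r
library(dplyr)

# Hypothetical dplyr counterpart of my.fun; both column names are strings.
my.fun.dplyr <- function(myData, MyGrouping, myColumnName) {
  myData %>%
    # all_of() selects the grouping column(s) named in the character vector
    group_by(across(all_of(MyGrouping))) %>%
    # .data[[...]] looks the column up by its string name
    summarise(count = sum(.data[[myColumnName]]), .groups = "drop")
}

dat <- data.frame(x = letters[1:2], y = 1:10)
res <- my.fun.dplyr(dat, "x", "y")
print(res)
```

Unlike setDT(), this leaves the caller's data frame unmodified; the result should contain the same sums (25 and 30) as Answer 1.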



Answer 3:


I guess I found a way that works with summarise. I'm not sure if I understand why, since I'm no expert in dealing with environments in R, but here's the solution:

> library(plyr)
> 
> 
> 
> ###########################
> # Creating test DataFrame #
> ###########################
> 
> x <- 1:15
> 
> set.seed(1)
> y <- letters[1:3][sample(1:3, 15, replace = TRUE)]
> 
> df <- data.frame(x, y)
> 
> ### check df
> df
    x y
1   1 a
2   2 b
3   3 b
4   4 c
5   5 a
6   6 c
7   7 c
8   8 b
9   9 b
10 10 a
11 11 a
12 12 a
13 13 c
14 14 b
15 15 c
> 
> 
> ######################
> # auxiliary function #
> ######################
> evalString <- function(s) {
+ eval(parse(text = s), parent.frame())
+ }
> 
> 
> ### columnName input
> columnName <- 'x'
> 
> ### call with columnName as input
> xMeans <- ddply(df,
+                 'y',
+                 summarise,
+                 mean = mean(evalString(columnName)))
> 
> 
> ### regular call to ddply
> xMeans2 <- ddply(df,
+                 'y',
+                 summarise,
+                 mean = mean(x))
> 
> 
> ### Compare Results
> xMeans
  y mean
1 a  7.8
2 b  7.2
3 c  9.0
> xMeans2
  y mean
1 a  7.8
2 b  7.2
3 c  9.0
>   

EDIT: You can use the get function from the base package instead, as suggested in the question "ddply: how do I pass column names as parameters?":

> xMeans3 <- ddply(df,
+                 'y',
+                 summarise,
+                 mean = mean(get(columnName)))
> 
> xMeans3
  y mean
1 a  7.8
2 b  7.2
3 c  9.0
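The calls above run at the top level, where columnName is a global variable. Inside a wrapper function, summarise may not see a variable that is local to the wrapper (plyr provides here() for that situation). A minimal sketch that sidesteps the issue by combining the anonymous-function idea from Answer 1 with a named result column is shown below. The names mean.by, group.col and value.col are hypothetical, and the data frame is typed in from the printout above rather than sampled, so it does not depend on the random number generator.

```r
library(plyr)

# Hypothetical wrapper: both column names are passed as strings.
mean.by <- function(df, group.col, value.col) {
  # The anonymous function is a closure, so it can see value.col directly;
  # returning a one-row data.frame gives the result column a proper name.
  ddply(df, group.col, function(d) data.frame(mean = mean(d[[value.col]])))
}

# Same data as the df printed above
df <- data.frame(
  x = 1:15,
  y = c("a", "b", "b", "c", "a", "c", "c", "b", "b", "a", "a", "a", "c", "b", "c")
)

xMeans4 <- mean.by(df, "y", "x")
print(xMeans4)
```

With this data the group means should match xMeans above (7.8, 7.2 and 9.0 for a, b and c).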


Source: https://stackoverflow.com/questions/10178203/sending-in-column-name-to-ddply-from-function
