Question
I am trying to use ddply and summarise together from the plyr package, but I am having difficulty passing in column names that keep changing. In my example, I would like to pass in X1 programmatically rather than hard-coding X1 into the ddply call.
Setting up an example:
require(xts)
require(plyr)
require(reshape2)
require(lubridate)
t <- xts(matrix(rnorm(10000),ncol=10), Sys.Date()-1000:1)
t.df <- data.frame(coredata(t))
t.df <- cbind(day=wday(index(t), label=TRUE, abbr=TRUE), t.df)
t.df.l <- melt(t.df, id.vars=c("day",colnames(t.df)[2]), measure.vars=colnames(t.df)[3:ncol(t.df)])
This is the bit I am struggling with:
cor.vars <- ddply(t.df.l, c("day","variable"), summarise, cor(X1, value))
I do not want to use the term X1 and would like to use something like:
cor.vars <- ddply(t.df.l, c("day","variable"), summarise, cor(colnames(t.df)[2], value))
but that comes up with the error: Error in cor(colnames(t.df)[2], value) : 'x' must be numeric
I also tried various other combinations that pass the vector values to the x argument of cor, but none of them seem to work.
Any ideas?
Answer 1:
Although this is probably not the intended usage for summarize and there must be much better approaches to your problem, the direct answer to your question is to use get:
ddply(t.df.l, c("day","variable"), summarise, cor(get(colnames(t.df)[2]), value))
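The reason the original attempt fails is that colnames(t.df)[2] evaluates to the character string "X1", so cor receives a string instead of the numeric column. get looks up the object with that name, and inside summarise the columns of each piece are visible as variables. A minimal base R sketch of the same distinction follows; the data frame `d` and the variable `col` are made up for illustration, and `with` stands in for summarise here:

```r
# Hypothetical toy data; `d` and `col` are illustrative names only.
set.seed(1)
d <- data.frame(X1 = rnorm(20), value = rnorm(20))
col <- colnames(d)[1]   # the string "X1", not the column itself

# Passing the string reproduces the error from the question;
# tryCatch captures the message instead of stopping the script.
res <- tryCatch(with(d, cor(col, value)),
                error = function(e) conditionMessage(e))
print(res)

# get() resolves the string to the column visible where it is called.
r_get <- with(d, cor(get(col), value))

# Reference value computed with the hard-coded column name.
r_direct <- cor(d$X1, d$value)
print(c(r_get, r_direct))
```

Both correlations should agree, since get(col) and d$X1 refer to the same column.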
Edit: here is one approach that is, in my opinion, better suited to your problem:
ddply(t.df.l, c("day", "variable"), function(x)cor(x["X1"], x["value"]))
Above, "X1" can also be replaced by 2 or the name of a variable holding "X1", etc. It depends on how you want to programmatically access the column.
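As a rough sketch of the variable-based variant, the column name can be stored in a variable and used to index each piece inside the function. The names `toy`, `ref.col`, `cor.by.name` and `cor.by.pos` are hypothetical, the toy data only mimics the long shape of t.df.l, and `[[` is used here instead of `[` so that cor receives plain numeric vectors:

```r
library(plyr)

# Hypothetical toy data in the same long shape as t.df.l.
set.seed(42)
toy <- data.frame(
  day      = rep(c("Mon", "Tue"), each = 20),
  variable = rep(rep(c("X2", "X3"), each = 10), times = 2),
  X1       = rnorm(40),
  value    = rnorm(40)
)

# The reference column is chosen programmatically, not hard-coded.
ref.col <- colnames(toy)[3]

# Access the column by the name held in a variable.
cor.by.name <- ddply(toy, c("day", "variable"), function(x) {
  data.frame(cor = cor(x[[ref.col]], x[["value"]]))
})

# Access the same column by position; each piece keeps all columns.
cor.by.pos <- ddply(toy, c("day", "variable"), function(x) {
  data.frame(cor = cor(x[[3]], x[["value"]]))
})

print(cor.by.name)
```

Indexing by name is usually the safer choice of the two, because a positional index silently points at a different column if the column order changes.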
Source: https://stackoverflow.com/questions/12724745/ddply-summarise-function-column-name-input