understanding ddply error message

Anonymous (unverified), submitted on 2019-12-03 02:56:01

Question:

I am trying to figure out why I am getting an error message when using ddply.

Example data:

    data <- data.frame(
      area = rep(c("VA", "OC", "ES"), each = 4),
      sex = rep(c("Male", "Female"), each = 2, times = 3),
      year = rep(c(2009, 2010), times = 6),
      bin = c(110, 120, 125, 125, 110, 130, 125, 80, 90, 90, 80, 140),
      shell_length = c(.4, 4, 1, 2, .2, 5, .4, 4, .8, 4, .3, 4)
    )

    bin7 <- ddply(data, .(area, year, sex, bin), summarize, n_bin = length(shell_length))

Error message: Error in .fun(piece, ...) : argument "by" is missing, with no default

I got this error message yesterday. I restarted R and reran the code and everything was fine. This morning I got the error message again and restarting R did not solve the problem.

I also tried to run some example code and got the same error message.

    # Summarize a dataset by two variables
    require(plyr)
    dfx <- data.frame(
      group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
      sex = sample(c("M", "F"), size = 29, replace = TRUE),
      age = runif(n = 29, min = 18, max = 54)
    )

    # Note the use of the '.' function to allow
    # group and sex to be used without quoting
    ddply(dfx, .(group, sex), summarize,
      mean = round(mean(age), 2),
      sd = round(sd(age), 2))

R information

    R version 3.2.1 (2015-06-18)
    Platform: i386-w64-mingw32/i386 (32-bit)
    Running under: Windows 7 x64 (build 7601) Service Pack 1

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] grid      stats     graphics  grDevices utils     datasets
    [7] methods   base

    other attached packages:
     [1] Hmisc_3.17-0        ggplot2_1.0.1       Formula_1.2-1
     [4] survival_2.38-1     car_2.0-26          MASS_7.3-40
     [7] xlsx_0.5.7          xlsxjars_0.6.1      rJava_0.9-7
    [10] plyr_1.8.3          latticeExtra_0.6-26 RColorBrewer_1.1-2
    [13] lattice_0.20-31

If someone could please explain why this is happening I would appreciate it.

Thanks

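Answer 1:

As stated in Narendra's comment to the question, this error can be caused by loading another package whose summarize (or summarise) function masks the one in plyr and behaves differently. Hmisc is one such package: its summarize requires a by argument, which is exactly what the error message complains about. For instance: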

    library(plyr)
    library(Hmisc)

    ddply(iris, "Species", summarize, mean_sepal_length = mean(Sepal.Length))
    #> Error in .fun(piece, ...) : argument "by" is missing, with no default

One solution is to call the correct function with :: and the correct namespace:

    ddply(iris, "Species", plyr::summarize, mean_sepal_length = mean(Sepal.Length))
    #>      Species mean_sepal_length
    #> 1     setosa             5.006
    #> 2 versicolor             5.936
    #> 3  virginica             6.588

Alternatively, one can detach the package that has the wrong function:

    detach(package:Hmisc)
    ddply(iris, "Species", summarize, mean_sepal_length = mean(Sepal.Length))
    #>      Species mean_sepal_length
    #> 1     setosa             5.006
    #> 2 versicolor             5.936
    #> 3  virginica             6.588

Finally, if one needs both packages and does not want to bother with ::, one can load them in the other order, so that plyr is attached last and its summarize masks the one from Hmisc:

    library(Hmisc)
    library(plyr)

    ddply(iris, "Species", summarize, mean_sepal_length = mean(Sepal.Length))
    #>      Species mean_sepal_length
    #> 1     setosa             5.006
    #> 2 versicolor             5.936
    #> 3  virginica             6.588
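
The load-order effect described above can be reproduced without installing anything. The following is a minimal sketch using base R only: first_pkg, second_pkg and summarize_demo are hypothetical stand-ins (plain environments, not real packages), used only to show that whichever one is attached last is found first.

```r
# Two stand-in "packages": plain environments, each defining a function
# with the same name but a different signature (all names are hypothetical).
first_pkg  <- new.env()
second_pkg <- new.env()
first_pkg$summarize_demo  <- function(.data, ...) "first_pkg version"
second_pkg$summarize_demo <- function(X, by, FUN, ...) "second_pkg version"

# Attach them in order; the one attached last sits earlier on the search path.
attach(first_pkg,  name = "first_pkg")
attach(second_pkg, name = "second_pkg")  # R reports that summarize_demo is masked

# find() lists every search-path entry defining the name, in lookup order.
lookup_order  <- find("summarize_demo", mode = "function")
winner_before <- summarize_demo()        # resolved from second_pkg

# Detaching the masking entry makes the earlier definition visible again.
detach("second_pkg")
winner_after <- summarize_demo()         # resolved from first_pkg
detach("first_pkg")

print(lookup_order)
print(c(before = winner_before, after = winner_after))
```

In a real session the same check is find("summarize"): with plyr attached first and Hmisc second, it would be expected to list package:Hmisc ahead of package:plyr, which is the situation that triggers the error.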


Answer 2:

I had a similar problem (with a different data set, but the same error message), but I discovered that ddply worked with the UK spelling "summarise". Once I made the spelling change, the code worked.

Here's the code I used. When I used the "z" spelling, I got the error message Error in .fun(piece, ...) : argument "by" is missing, with no default; but changing to "s" solved it.

    library(plyr)
    ddply(InsectSprays, .(spray), summarise, sum = sum(count))
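
A likely reason the spelling change helps is that plyr exports both spellings, while a masking package such as Hmisc appears to define only the "z" one, so summarise is never masked. The sketch below is one way to check this on your own installation; exports_name is a hypothetical helper, not part of either package, and it returns NA when a package is not installed.

```r
# Hypothetical helper: does package `pkg` export an object called `name`?
# Returns NA when the package is not installed, so the check never errors.
exports_name <- function(pkg, name) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  name %in% getNamespaceExports(pkg)
}

# Compare both spellings across the two packages from this thread.
for (pkg in c("plyr", "Hmisc")) {
  for (name in c("summarize", "summarise")) {
    cat(pkg, name, exports_name(pkg, name), "\n")
  }
}
```

If a package reports TRUE for summarize but FALSE for summarise, only the "z" spelling can be masked by it, which would explain why switching to the "s" spelling avoids the error.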

