R function containing plyr's ddply(): parameters are not passed correctly inside ddply()

你离开我真会死。 Submitted on 2019-12-13 02:57:02

Question


My data is as follows:

>df2
   id     calmonth       product
1 101       01           apple
2 102       01           apple&nokia&htc
3 103       01           htc
4 104       01           apple&htc
5 104       02           nokia

para=c('apple','htc','nokia')

I want to count, for each calmonth, the ids whose product contains a given pair such as apple&htc or apple&nokia. I wrote the following function:

xandy=function(a,b){
  ddply(df2,.(calmonth),summarise,
        csum=length(grep(paste0('apple','.*','htc'),product)),
        coproduct=paste0('apple','&','htc'))
}

This function gives me exactly the result I expect:

> xandy(para[1],para[3])
  calmonth csum   coproduct
1       01    2   apple&htc
2       02    0   apple&htc

But I need more than apple&htc: I also need apple&nokia and so on. So I replaced the hard-coded 'apple' and 'htc' with parameters. The new function looks like this:

xandy=function(a,b){
  ddply(df2,.(calmonth),summarise,
        csum=length(grep(paste0(a,'.*',b),product)),
        coproduct=paste0(a,'&',b))
}

See the difference? I have replaced 'apple' and 'htc' with the parameters a and b. But the result is not at all what I want:

> xandy(para[1],para[3])
Error in eval(expr, envir, enclos) : argument is missing, with no default
In addition: Warning message:
In grep(paste0(a, ".*", b), product) :
  argument 'pattern' has length > 1 and only the first element will be used


Answer 1:


A straightforward solution to your problem might be:

# sum() counts the TRUE values; length() would return the group size instead
ddply(df2, .(calmonth), summarise,
      apple = as.numeric(sum(product == "apple")),
      apple.nokia.htc = as.numeric(sum(product == "apple&nokia&htc")),
      htc = as.numeric(sum(product == "htc")),
      apple.htc = as.numeric(sum(product == "apple&htc"))
)



Answer 2:


With the help of MengChen and others, I arrived at a straightforward answer:

xandy=function(a,b){
  # build the strings outside ddply() and pass them in as extra arguments
  myStr_match=paste0(a,'.*',b)
  myStr_match1=paste0(b,'.*',a)
  ajoinb_match=paste0(a,'&',b)
  ddply(df2,.(calmonth),function(data,myStr,myStr1,ajoinb){
    summarise(data,
              csum=max(length(grep(myStr,product)),length(grep(myStr1,product))),
              coproduct=ajoinb)
  },myStr=myStr_match,myStr1=myStr_match1,ajoinb=ajoinb_match)
}

Maybe this is not the best answer, but it does work anyway.
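
For comparison, here is a minimal sketch of another way to write the same function. It is not from the original answers. The likely cause of the error is that summarise, when called through ddply(), evaluates its expressions where the arguments a and b of the enclosing function are not visible. The sketch avoids summarise entirely and uses an ordinary anonymous function that captures a and b as a closure. The name xandy2 and the data.frame() construction of df2 are assumptions made for illustration. It also counts rows that contain both products in either order, which can differ from the max() approach above when both orders occur in the same month.

```r
library(plyr)

# Assumed reconstruction of the df2 printed in the question
df2 <- data.frame(
  id = c(101, 102, 103, 104, 104),
  calmonth = c("01", "01", "01", "01", "02"),
  product = c("apple", "apple&nokia&htc", "htc", "apple&htc", "nokia"),
  stringsAsFactors = FALSE
)

# Hypothetical variant of xandy(): the anonymous function is a closure,
# so a, b and pattern are found through normal lexical scoping.
xandy2 <- function(a, b) {
  # match "a ... b" or "b ... a"; a and b are treated as regular expressions
  pattern <- paste0(a, ".*", b, "|", b, ".*", a)
  ddply(df2, .(calmonth), function(d) {
    data.frame(
      csum = sum(grepl(pattern, d$product)),  # rows containing both products
      coproduct = paste0(a, "&", b),
      stringsAsFactors = FALSE
    )
  })
}

print(xandy2("apple", "htc"))
print(xandy2("apple", "nokia"))
```

With the sample data, xandy2("apple", "htc") gives csum 2 for calmonth 01 and 0 for calmonth 02, the same as the hard-coded version in the question. If a product name could contain regex metacharacters, the pattern would need escaping first.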



Source: https://stackoverflow.com/questions/21041252/r-function-contain-plyr-ddply-parameters-in-ddply-cannot-be-past-correctly
