Problems using dplyr in a function (group_by)

Posted by 喜夏-厌秋 on 2019-12-07 02:37:06

Question


I want to use dplyr for some data manipulation. Background: I have a survey weight and a bunch of variables (mostly Likert items). I want to compute the frequencies and percentages per category, both with and without the survey weight.

As an example, let us just use frequencies for the gender variable. The result should be this:

 gender   freq   freq.weighted
      1    292        922.2906
      2    279        964.7551
      9      6         21.7338

I will do this for many variables, so I decided to put the dplyr code inside a function. That way I only have to change the variable and can type less.

#exampledata
gender<-c("2","2","1","2","2","2","2","2","2","2","2","2","1","1","2","2","2","2","2","2","1","2","2","2","2","2","2","2","2","2")
survey_weight<-c("2.368456","2.642901","2.926698","3.628653","3.247463","3.698195","2.776772","2.972387","2.686365","2.441820","3.494899","3.133106","3.253514","3.138839","3.430597","3.769577","3.367952","2.265350","2.686365","3.189538","3.029999","3.024567","2.972387","2.730978","4.074495","2.921552","3.769577","2.730978","3.247463","3.230097")
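# the weights above are stored as strings, so convert them to numbers before summing
test_dataframe<-data.frame(gender,survey_weight=as.numeric(survey_weight))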

#function
weighting.function<-function(dataframe,variable){
  test_weighted<- dataframe %>% 
    group_by_(variable) %>% 
    summarise_(interp(freq=count(~weight)),
               interp(freq_weighted=sum(~weight)))
  return(test_weighted)
}

result_dataframe<-weighting.function(test_dataframe,"gender")

#this second step was left out in this example:
#mutate_(perc=interp(~freq/sum(~freq)*100),perc_weighted=interp(~freq_weighted/sum(~freq_weighted)*100))

This leads to the following error message:

Error in UseMethod("group_by_") : 
  no applicable method for 'group_by_' applied to an object of class "formula" 

I have tried a lot of different things. First, I used freq=n() to count the frequencies, but I always got an error (I checked that plyr was loaded before dplyr and not afterwards; it still didn't work).

Any ideas? I read the vignette on standard evaluation, but I always run into problems and have no idea what a solution could be.


Answer 1:


I think you have a few nested mistakes which are causing you problems. The biggest one is using count() instead of summarise(). I'm guessing you wanted n():

weighting.function <- function(dataframe, variable){
  dataframe %>% 
    group_by_(variable) %>% 
    summarise_(
      freq = ~n(),
      freq_weighted = ~sum(survey_weight)
    )
}

weighting.function(test_dataframe, ~gender)

You also had a few unneeded uses of interp(). If you do use interp(), the call should look like freq = interp(~n()), i.e. the name is outside the call to interp, and the thing being interpolated starts with ~.
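The underscore verbs used above (group_by_(), summarise_()) have been deprecated since dplyr 0.7. As a minimal sketch of how the same function might look with current tidy evaluation, assuming dplyr 1.0 or later: the function name weighting_function, the weight argument, and the small survey data frame are made up for illustration and are not part of the original answer. The weights are stored as numbers here, because sum() does not work on strings or factors. The mutate() step adds the percentages that the question left out.

```r
library(dplyr)

# Hypothetical example data; the weights are numeric, not strings.
survey <- data.frame(
  gender = c("1", "2", "2", "1", "2"),
  survey_weight = c(2.5, 3.0, 1.5, 0.5, 2.5)
)

# Hypothetical rewrite of the answer's function using {{ }} (embracing),
# which forwards bare column names into dplyr verbs.
weighting_function <- function(dataframe, variable, weight) {
  dataframe %>%
    group_by({{ variable }}) %>%
    summarise(
      freq = n(),                         # unweighted count per category
      freq_weighted = sum({{ weight }}),  # sum of weights per category
      .groups = "drop"
    ) %>%
    mutate(
      perc = freq / sum(freq) * 100,
      perc_weighted = freq_weighted / sum(freq_weighted) * 100
    )
}

result <- weighting_function(survey, gender, survey_weight)
print(result)
```

With this style the column names are passed unquoted, e.g. weighting_function(survey, gender, survey_weight), so neither a formula (~gender) nor interp() is needed.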



Source: https://stackoverflow.com/questions/28157919/problems-using-dplyr-in-a-function-group-by
