R Frequency Table of Likert Data

Submitted by 人走茶凉 on 2019-12-06 09:01:16
q1<-c(2,2,3,3,3,4,4,4,5,5)
q2<-c(2,3,3,4,4,4,4,5,5,5)
q3<-c(2,2,2,3,4,4,4,5,5,5)
df<-data.frame(q1,q2,q3)

library(expss)
# add value labels to preserve empty categories
val_lab(df) = autonum(1:5)
res = df
for(each in colnames(df)){
    res = res %>% 
        tab_cells(list(each)) %>% 
        tab_cols(vars(each)) %>% 
        tab_stat_rpct(total_row_position = "none")
}


res = res %>% tab_pivot() 
# add percentage sign
recode(res[,-1]) = other ~ function(x) ifelse(is.na(x), NA, paste0(round(x, 0), "%"))
res

# |    |  1 |   2 |   3 |   4 |   5 |
# | -- | -- | --- | --- | --- | --- |
# | q1 |    | 20% | 30% | 30% | 20% |
# | q2 |    | 10% | 20% | 40% | 30% |
# | q3 |    | 30% | 10% | 30% | 30% |

If you use knitr then the following code will be helpful:

library(knitr)
res %>% kable

Phil

I wouldn't advise doing this, because a table of formatted strings is not useful for later wrangling, but to get the output exactly as asked:

for (i in seq_along(names(df))) {
 assign(paste0("x",i), prop.table(table(factor(df[[i]], levels = 1:5))))
}

result <- rbind(x1, x2, x3)
rownames(result) <- names(df)

as.data.frame(matrix(
sprintf("%.0f%%", result*100), 
nrow(result), 
dimnames = dimnames(result)
))

   1   2   3   4   5
q1 0% 20% 30% 30% 20%
q2 0% 10% 20% 40% 30%
q3 0% 30% 10% 30% 30%

The last bit of code is as suggested here.
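As a hedged alternative, the same table can be built without `assign()` by applying the tabulation to every column at once. This is only a sketch of the same idea in base R; the names `props` and `formatted` are illustrative and do not appear in the answer above.

```r
# Same example data as in the question
q1 <- c(2, 2, 3, 3, 3, 4, 4, 4, 5, 5)
q2 <- c(2, 3, 3, 4, 4, 4, 4, 5, 5, 5)
q3 <- c(2, 2, 2, 3, 4, 4, 4, 5, 5, 5)
df <- data.frame(q1, q2, q3)

# One column of proportions per question; factor(levels = 1:5) keeps
# the empty category 1. t() puts questions in rows.
props <- t(sapply(df, function(x) prop.table(table(factor(x, levels = 1:5)))))

# Format as percentage strings while keeping the row and column names
formatted <- as.data.frame(matrix(
  sprintf("%.0f%%", props * 100),
  nrow(props),
  dimnames = dimnames(props)
))
formatted
```

Keeping the numeric matrix `props` around is the point of this variant: it stays usable for further wrangling, and only `formatted` holds strings.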

kishan

It is hard to give a precise answer without knowing what the data looks like. However, assuming I already have some sort of data frame, I would start by writing functions that systematically transform the data into the plots. I would also use ggplot2 rather than base R graphics, as it is more flexible.


Suppose you have one data frame per survey. In my experience, each row then has one column that identifies the question and another that holds the response given to that question.

That is:

set.seed(42)  # make the randomly sampled responses reproducible
survey = data.frame(
  question = factor(rep(1:6, 4)),
  response = factor(c(1:5, sample(1:5, 19, replace = TRUE)))
)

Then you can create a function that calculates, for each question, the percentage of answers that fall on each response, given a data frame like the one above:

library(plyr)

# Assumes survey has columns question and response.
# Returns one row per question/response pair with its share (0-100) of that question's answers.
calculate_percent = function(survey){
  # Split by question
  ddply(survey, ~question, function(rows){
    total_responses = nrow(rows)

    # Within one question, split by response and compute its percentage
    response_percent = ddply(rows, ~response, function(rows_response){
      count_response = nrow(rows_response)
      data.frame(response = unique(rows_response$response),
                 percent = (count_response / total_responses) * 100)
    })

    data.frame(question = unique(rows$question), response_percent)
  })
}
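To cross-check the percentages without plyr, the same per-question shares can be computed with a cross-tabulation in base R. This is a sketch under assumptions: `survey_example`, `percent_table` and `percent_long` are hypothetical names, and the small data frame is made up for illustration.

```r
# Hypothetical data in the same long format: one row per answer
survey_example <- data.frame(
  question = factor(c(1, 1, 1, 1, 2, 2, 2, 2)),
  response = factor(c(1, 2, 2, 5, 3, 3, 3, 4), levels = 1:5)
)

# Count answers per question/response pair, then turn each row
# (one question) into percentages that sum to 100
counts <- table(question = survey_example$question,
                response = survey_example$response)
percent_table <- prop.table(counts, margin = 1) * 100

# Long format with columns question, response, percent;
# drop the pairs that never occurred
percent_long <- as.data.frame(percent_table, responseName = "percent")
percent_long <- percent_long[percent_long$percent > 0, ]
percent_long
```

Unlike the nested `ddply` version, the intermediate `percent_table` keeps responses that nobody chose as explicit zeros, which can matter for Likert scales where every level should appear.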

Then you can create a function that makes a plot given a data frame like the one defined above.

library(ggplot2)
library(scales)

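percentage_plot = function(survey){

  calculated_percentages = calculate_percent(survey)

  ggplot(calculated_percentages, aes(x = question, y = percent)) + 
    geom_bar(aes(fill = response), stat = "identity", position = "dodge") +
    # percent is already on a 0-100 scale, so tell scales not to multiply
    # by 100 again (plain `percent` would label 20 as 2000%)
    scale_y_continuous(labels = label_percent(scale = 1))
}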

Finally, the plot is produced with the call

percentage_plot(survey)

Since you have multiple surveys, you can then generalize with additional functions that systematically process each survey's data in the same manner as above.

You could also have drawn the plots above as facets rather than as the grouped bar plots used here. However, since you have more than one survey, you may prefer to use facets at the survey level.
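A minimal sketch of faceting at the survey level, assuming the percentages of all surveys have already been stacked into one data frame with an extra column naming the survey. The data frame `all_percentages`, its `survey` column and the values in it are hypothetical.

```r
library(ggplot2)

# Hypothetical pre-computed percentages (0-100) for two surveys,
# one row per survey/question/response combination
all_percentages <- data.frame(
  survey   = rep(c("survey_a", "survey_b"), each = 4),
  question = factor(rep(c(1, 1, 2, 2), times = 2)),
  response = factor(rep(c(1, 2), times = 4)),
  percent  = c(50, 50, 25, 75, 40, 60, 10, 90)
)

# Grouped bars per question, one panel per survey
survey_plot <- ggplot(all_percentages,
                      aes(x = question, y = percent, fill = response)) +
  geom_col(position = "dodge") +
  facet_wrap(~ survey) +
  # values are already percentages, so only append the sign
  scale_y_continuous(labels = function(x) paste0(x, "%"))
```

Printing `survey_plot` draws one panel per survey. Swapping the roles, with `facet_wrap(~ question)` and `x = survey`, would instead compare surveys within each question.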


Sorry, I started writing my example before your edit; hopefully you can still adapt it to your use case.

Actually it seems that I misunderstood your question and answered a different one.
