Dynamic R dataframes - change yes/no responses to 1/0

*爱你&永不变心* posted on 2019-12-25 11:50:53

Question


I use an API call to LimeSurvey to get data into a Shiny R app I'm working on. I then manipulate the dataframe so that I have only the responses given by a certain individual over time. The dataframe can look like this:

Appetite <- c("No","Yes","No","No","No","No","No","No","No")
Dental.Health <- c("No","Yes","No","No","No","No","Yes","Yes","No")
Dry.mouth <- c("No","Yes","Yes","Yes","Yes","No","Yes","Yes","No")
Mouth.opening <- c("No","No","Yes","Yes","Yes","No","Yes","Yes","No")
Pain.elsewhere <- c("No","Yes","No","No","No","No","No","No","No")
Sleeping <- c("No","No","No","No","No","Yes","No","No","No")
Sore.mouth <- c("No","No","Yes","Yes","No","No","No","No","No")
Swallowing <- c("No","No","No","No","Yes","No","No","No","No")
Cancer.treatment <- c("No","No","Yes","Yes","No","Yes","No","No","No")
Support.for.my.family <- c("No","No","Yes","Yes","No","No","No","No","No")
Fear.of.cancer.coming.back <- c("No","No","Yes","Yes","No","No","Yes","No","No")
Intimacy  <- c("Yes","No","No","No","No","No","No","No","No")
Dentist   <- c("No","Yes","No","No","No","No","No","No","No")
Dietician <- c("No","No","Yes","Yes","No","No","No","No","No")
Date.submitted <- c("2002-07-25 00:00:00",
                 "2002-09-05 00:00:00",
                 "2003-01-09 00:00:00",
                 "2003-01-09 00:00:00",
                 "2003-07-17 00:00:00",
                 "2003-11-06 00:00:00",
                 "2004-12-17 00:00:00",
                 "2005-06-03 00:00:00",
                 "2005-12-17 00:00:00")

theDataFrame <- data.frame( Date.submitted,
                            Appetite,
                            Dental.Health,
                            Dry.mouth,
                            Mouth.opening,
                            Pain.elsewhere,
                            Sleeping,
                            Sore.mouth,
                            Swallowing,
                            Cancer.treatment,
                            Support.for.my.family,
                            Fear.of.cancer.coming.back,
                            Intimacy,
                            Dentist,
                            Dietician)

To be clear, this dataframe could contain more (or fewer) observations of more (or fewer) variables than the example above.

My goal is to make a dynamic stacked bar chart (one bar per submission date, one colour per question) that looks like the following:

library(dplyr)
library(ggplot2)
library(tidyr)

df <- data.frame(timeline = Sys.Date() - 1:10,
                 q3 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q4 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q5 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q6 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q7 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q8 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 stringsAsFactors = FALSE) %>%
    mutate(q3 = ifelse(q3 == "Yes", 1, 0),
           q4 = ifelse(q4 == "Yes", 1, 0),
           q5 = ifelse(q5 == "Yes", 1, 0),
           q6 = ifelse(q6 == "Yes", 1, 0),
           q7 = ifelse(q7 == "Yes", 1, 0),
           q8 = ifelse(q8 == "Yes", 1, 0)) %>%
    gather(key = question, value = value, q3, q4, q5, q6, q7, q8)

g <- ggplot(df, aes(x = timeline, y = value, fill = question)) +
    geom_bar(stat = "identity")

g 

I think I will need to use library(lubridate) for the timeline, as the entire dataframe is plain text. I deal with the '.' in the column names like this:

myColNames <- colnames(theDataFrame)

myNames <- myColNames

myNames <- gsub("^X\\.\\.", "", myNames)
myNames <- gsub("\\.", " ", myNames)
names(theDataFrame) <- myNames # items in myChoices get "labels" from myNames
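For the timeline, here is a minimal sketch of the parsing step with lubridate, assuming the timestamps always arrive as "YYYY-MM-DD HH:MM:SS" text as in the example above. The names parsed and timeline are illustrative only.

```r
library(lubridate)

# Timestamps as they appear in the example data (plain character strings).
Date.submitted <- c("2002-07-25 00:00:00", "2002-09-05 00:00:00")

# ymd_hms() parses "YYYY-MM-DD HH:MM:SS" strings into POSIXct (UTC by default).
parsed <- ymd_hms(Date.submitted)

# Keep only the calendar date, which is enough for the x axis of the plot.
timeline <- as_date(parsed)
```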

But the most challenging aspect is getting this to work dynamically. The datasets will contain only Date.submitted plus an arbitrary number of additional columns whose values are always "Yes" or "No".

I hope I've given enough information (this is my first question on Stack Exchange!)


Answer 1:


We can update every column except the first (Date.submitted) in place using base R:

theDataFrame[-1] <- +(theDataFrame[-1]=="Yes")

Or with lapply, which works column by column and avoids building an intermediate matrix when the dataset is big:

theDataFrame[-1] <- lapply(theDataFrame[-1], function(x) as.integer(x=="Yes"))
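Because neither line names any column, the same conversion can feed the plot from the question whatever the number of Yes/No columns. The following is a rough sketch rather than a definitive implementation. It assumes the first column is always Date.submitted, and the helper name responses_to_long and the small example data frame are made up for illustration. It uses tidyr::pivot_longer in place of gather and geom_col in place of geom_bar(stat = "identity").

```r
library(tidyr)
library(ggplot2)

# Hypothetical helper: converts every column except the first to 0/1 and
# reshapes to long format. Assumes the first column is Date.submitted.
responses_to_long <- function(df) {
  df[-1] <- lapply(df[-1], function(x) as.integer(x == "Yes"))
  df$Date.submitted <- as.Date(df$Date.submitted)
  pivot_longer(df, cols = -Date.submitted,
               names_to = "question", values_to = "value")
}

# Small stand-in for the survey data (illustrative values only).
example <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00", "2002-09-05 00:00:00"),
  Appetite = c("No", "Yes"),
  Sleeping = c("Yes", "No"),
  stringsAsFactors = FALSE
)

long <- responses_to_long(example)

# Stacked bars: one bar per submission date, one fill colour per question.
g <- ggplot(long, aes(x = Date.submitted, y = value, fill = question)) +
  geom_col()
```

Adding or removing Yes/No columns in the input changes only the number of fill groups; nothing in the helper needs editing.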



Answer 2:


You could also use dplyr::mutate_all together with purrr::map_chr.

Note: I created theDataFrame with stringsAsFactors = FALSE, so the columns are character vectors rather than factors:

theDataFrame <- data.frame( Date.submitted,
                            Appetite,
                            Dental.Health,
                            Dry.mouth,
                            Mouth.opening,
                            Pain.elsewhere,
                            Sleeping,
                            Sore.mouth,
                            Swallowing,
                            Cancer.treatment,
                            Support.for.my.family,
                            Fear.of.cancer.coming.back,
                            Intimacy,
                            Dentist,
                            Dietician, stringsAsFactors = FALSE)

- Create a function that converts a single value, for instance:

ConvertYesNo<- function(x){
  if(x=="Yes") y <- as.integer(1)
  else if (x=="No") y <- as.integer(0)
  else y <- x

  return(y)
}

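- Use it with mutate_all, which applies it to every column, or pick the columns you want with mutate_at. Because the function handles one value at a time, map it over each column. Two caveats: funs() is deprecated in current dplyr, so the formula form is used here, and map_chr returns character vectors, so the converted columns hold the strings "0" and "1" (recent purrr versions may also warn about the integer-to-character coercion):

library(dplyr)
library(purrr)

theDataFramex <- theDataFrame %>% 
  mutate_all(~ map_chr(., ConvertYesNo))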

> head(theDataFramex,3 )
       Date.submitted Appetite Dental.Health Dry.mouth Mouth.opening Pain.elsewhere Sleeping
1 2002-07-25 00:00:00        0             0         0             0              0        0
2 2002-09-05 00:00:00        1             1         1             0              1        0
3 2003-01-09 00:00:00        0             0         1             1              0        0
  Sore.mouth Swallowing Cancer.treatment Support.for.my.family Fear.of.cancer.coming.back
1          0          0                0                     0                          0
2          0          0                0                     0                          0
3          1          0                1                     1                          1
  Intimacy Dentist Dietician
1        1       0         0
2        0       1         0
3        0       0         1
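If integer columns are wanted rather than the strings that map_chr produces, one possible variant uses mutate_at to skip the date column and map_int to keep the result numeric. This is a sketch under assumptions, not part of the original answer. The function yes_no_to_int and the small example data frame are hypothetical, and anything other than "Yes" or "No" becomes NA.

```r
library(dplyr)
library(purrr)

# Hypothetical variant of ConvertYesNo that always returns a single integer.
yes_no_to_int <- function(x) {
  if (x == "Yes") 1L else if (x == "No") 0L else NA_integer_
}

# Small stand-in for the survey data (illustrative values only).
example <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00", "2002-09-05 00:00:00"),
  Appetite = c("No", "Yes"),
  Sleeping = c("Yes", "No"),
  stringsAsFactors = FALSE
)

# Convert every column except Date.submitted; map_int yields integer vectors.
converted <- example %>%
  mutate_at(vars(-Date.submitted), ~ map_int(.x, yes_no_to_int))
```

Here Date.submitted stays untouched as text, and the remaining columns can be summed or plotted directly.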


Source: https://stackoverflow.com/questions/42506600/dynamic-r-dataframes-change-yes-no-responses-to-1-0
