r

Define multiple values as missing in a data frame

落爺英雄遲暮 submitted on 2021-02-09 15:02:53
Question: How do I define multiple values as missing in a data frame in R? Consider a data frame where two values, "888" and "999", represent missing data:

    df <- data.frame(age=c(50,30,27,888),insomnia=c("yes","no","no",999))
    df[df==888] <- NA
    df[df==999] <- NA

This solution takes one line of code per value representing missing data. Do you have a simpler solution for situations where the number of values representing missing data is high?

Answer 1: Here are three solutions:

    # 1. Data set
    df <- data
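The excerpt cuts off before the answer's three solutions are shown, so the following is only a minimal sketch of one possible approach, not necessarily one of them. It assumes the missing-data codes can be collected in a single vector (named `missing_codes` here, a name chosen for illustration) and replaces every matching cell in one pass with base R.

```r
# Example data from the question: 888 and 999 stand for missing values.
df <- data.frame(age = c(50, 30, 27, 888),
                 insomnia = c("yes", "no", "no", 999))

# Assumption: all codes that mean "missing" are listed in one vector.
missing_codes <- c(888, 999)

# Replace matching cells column by column; `df[] <-` keeps the data frame shape.
# %in% compares after coercion, so the character value "999" also matches 999.
df[] <- lapply(df, function(col) replace(col, col %in% missing_codes, NA))

print(df)
```

Adding another code then only means extending `missing_codes`, rather than adding another line per value.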

ggsignif package error stat_signif requires the following missing aesthetics: y

半腔热情 submitted on 2021-02-09 14:28:24
Question: This is an invented example of my data:

    x <- c("Control", "Case", "Case", "Case", "Control", "Control", "Control", "Case", "Case", "Case")
    y <- c("Dead", "Dead", "Dead", "Alive", "Alive", "Dead", "Dead", "Dead", "Alive", "Dead")

I'm trying to represent this data in the form of a bar plot and then indicate a statistically significant difference in the proportion of alive and dead patients between the two experimental groups (cases and controls). I performed a Pearson's chi-squared test and the
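The error in the title suggests that `stat_signif` expects a `y` aesthetic, which a bar plot built from raw counts does not map explicitly. The base-R sketch below is not the asker's code: it only shows how the contingency table, the test, and a table of precomputed proportions (a column that could be mapped to `y`) might be produced. The names `group`, `outcome`, `tab`, and `props` are illustrative assumptions.

```r
# Invented example data from the question.
x <- c("Control", "Case", "Case", "Case", "Control",
       "Control", "Control", "Case", "Case", "Case")
y <- c("Dead", "Dead", "Dead", "Alive", "Alive",
       "Dead", "Dead", "Dead", "Alive", "Dead")

# 2 x 2 contingency table of group by outcome.
tab <- table(group = x, outcome = y)
print(tab)

# Pearson's chi-squared test; correct = FALSE turns off the continuity
# correction that chisq.test applies to 2 x 2 tables by default.
# With counts this small R warns that the approximation may be inaccurate.
test <- chisq.test(tab, correct = FALSE)
print(test$p.value)

# Row-wise proportions in long format: the Freq column is an explicit
# numeric value per group/outcome pair that a plot could map to y.
props <- as.data.frame(prop.table(tab, margin = 1))
print(props)
```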

Access the column names in the `mutate_at` to use it for subsetting a list

非 Y 不嫁゛ submitted on 2021-02-09 13:58:33
Question: I am trying to recode several variables, each with a different recoding scheme. The schemes are saved in a list where each element is a named vector of the form old = new; each element is the recoding scheme for one variable in the data frame. I am using the mutate_at and recode functions. I think the problem is that I cannot extract the variable name in order to look up the correct recoding scheme in the list. I tried deparse(substitute(.)) as in here, and also this, but neither helped. Also
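The asker's data and list are not shown in the excerpt, so the sketch below uses hypothetical ones (`df`, `schemes`, `q1`, `q2` are invented for illustration). It avoids the name-lookup problem by iterating over the columns and their schemes in parallel with base R's `Map`, instead of recovering the column name inside `mutate_at`.

```r
# Hypothetical data and recoding schemes; each list element is named after
# the column it applies to and holds a named vector of the form old = new.
df <- data.frame(q1 = c("a", "b", "a"),
                 q2 = c("x", "y", "y"),
                 stringsAsFactors = FALSE)
schemes <- list(
  q1 = c(a = "low", b = "high"),
  q2 = c(x = "no", y = "yes")
)

vars <- names(schemes)

# Pair each column with its own scheme and look the old values up by name.
# Values that are absent from a scheme become NA.
df[vars] <- Map(function(col, scheme) unname(scheme[as.character(col)]),
                df[vars], schemes)

print(df)
```

In newer dplyr versions, `across()` together with `cur_column()` is another way to reach the current column name, though whether it suits the asker's setup cannot be told from the excerpt.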

How to combine multiple data frame columns in R

送分小仙女□ submitted on 2021-02-09 12:14:03
Question: I have a .csv file with demographic data for my participants. The data are coded and downloaded from my study database (REDCap) in such a way that each race has its own separate column. That is, each participant has a value in each of these columns (1 if endorsed, 0 if unendorsed). It looks something like this:

    SubjID  Sex  Age  White  AA  Asian  Other
    001     F    62   0      1   0      0
    002     M    66   1      0   0      0

I have to use a roundabout way to get my demographic summary stats. There's gotta be a simpler way to do this. My
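As a rough illustration of what a simpler route might look like, the sketch below rebuilds the two rows shown in the question and collapses the 0/1 indicator columns into a single `Race` column with base R. The object name `demo`, the new `Race` column, and the "Multiple" label for participants who endorse more than one category are assumptions, not part of the original post.

```r
# The two example rows from the question, typed in by hand.
demo <- data.frame(SubjID = c("001", "002"),
                   Sex    = c("F", "M"),
                   Age    = c(62, 66),
                   White  = c(0, 1),
                   AA     = c(1, 0),
                   Asian  = c(0, 0),
                   Other  = c(0, 0))

race_cols <- c("White", "AA", "Asian", "Other")
flags <- as.matrix(demo[race_cols])

# Number of categories endorsed by each participant.
n_endorsed <- rowSums(flags == 1)

# Exactly one endorsement: take that column's name.
# Several: label as "Multiple" (an assumed convention). None: NA.
demo$Race <- ifelse(n_endorsed == 1,
                    race_cols[max.col(flags, ties.method = "first")],
                    ifelse(n_endorsed > 1, "Multiple", NA_character_))

# Summary statistics can then use the single column.
print(table(demo$Race, demo$Sex))
```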

Put row and column titles using grid.arrange in R

徘徊边缘 submitted on 2021-02-09 11:58:12
Question: The data for the ggplots:

    set.seed(0)
    library(ggplot2)
    library(gridExtra)
    c <- list()
    for (k in 1:9) c[[k]] <- ggplot(data.frame(x=1:10,y=rnorm(10)),aes(x=x,y=y))+geom_line()
    grid.arrange(c[[1]],c[[2]],c[[3]],c[[4]],c[[5]],c[[6]],c[[7]],c[[8]],c[[9]],
                 ncol=3, nrow=3, widths = c(4,4,4), heights = c(4,4,4))

I want titles for each row and each column. The shape of the output would be something like this:

             CTitle 1  CTitle 2  CTitle 3
    RTitle1  plot1     plot2     plot3
    RTitle2  plot4     plot5     plot6
    RTitle3
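One way to get the layout sketched in the question, offered as an assumption rather than the accepted answer, is to treat the titles as extra cells: a 4 x 4 grid whose first row holds the column titles and whose first column holds the row titles. The names `plots`, `col_titles`, `row_titles`, `cells`, and `titled` and the 0.5 relative size of the title cells are illustrative choices.

```r
library(ggplot2)
library(gridExtra)
library(grid)

set.seed(0)
# Nine line plots as in the question, converted to grobs up front.
plots <- lapply(1:9, function(k) {
  p <- ggplot(data.frame(x = 1:10, y = rnorm(10)), aes(x = x, y = y)) +
    geom_line()
  ggplotGrob(p)
})

# Title cells: plain text grobs, with the row titles rotated by 90 degrees.
col_titles <- lapply(paste("CTitle", 1:3), textGrob)
row_titles <- lapply(paste0("RTitle", 1:3), textGrob, rot = 90)

# Fill a 4 x 4 grid row by row: empty corner plus column titles first,
# then each row title followed by its three plots.
cells <- c(list(nullGrob()), col_titles)
for (r in 1:3) {
  cells <- c(cells, row_titles[r], plots[(3 * r - 2):(3 * r)])
}

# Narrow first column and short first row for the titles.
titled <- arrangeGrob(grobs = cells, ncol = 4,
                      widths = c(0.5, 4, 4, 4),
                      heights = c(0.5, 4, 4, 4))
```

Calling `grid.newpage()` followed by `grid.draw(titled)` should then render the titled grid on the current graphics device.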

R - mlr: Is there an easy way to get the variable importance of tuned support vector machine models in nested resampling (spatial)?

烂漫一生 submitted on 2021-02-09 11:46:24
Question: I am trying to get the variable importance for all predictors (or variables, or features) of a tuned support vector machine (SVM) model using e1071::svm through the mlr package in R, but I am not sure whether I am doing the assessment right. First, the idea: to get an honestly tuned SVM model, I am following the nested-resampling tutorial, using spatial n-fold cross-validation (SpRepCV) in the outer loop and spatial cross-validation (SpCV) in the inner loop. As tuning parameter gamma

Dendrogram with Corrplot (R)

好久不见. submitted on 2021-02-09 11:42:19
Question: Does anyone have a method to adorn an R corrplot correlation plot with a dendrogram?

Answer 1: The closest solution I know of is to use a heatmap on a correlation matrix; for example, you could also use gplots::heatmap.2. Here is how to do it using the heatmaply R package, which also offers an interactive interface where you can zoom in and get a tooltip when hovering over the cells:

    # for the first time:
    # install.packages("heatmaply")
    library(heatmaply)
    my_cor <- cor(mtcars)
    heatmaply_cor(my_cor)
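For readers without heatmaply installed, the same idea of a heatmap on a correlation matrix can be sketched with base R only. This swaps in `stats::heatmap` rather than the package the answer uses, and the choice of `1 - correlation` as the clustering distance is an assumption made for illustration.

```r
# Correlation matrix of the built-in mtcars data, as in the answer.
my_cor <- cor(mtcars)

# Assumption: use 1 - correlation as the dissimilarity for clustering,
# so strongly correlated variables end up close together.
hc <- hclust(as.dist(1 - my_cor))

# Base-R heatmap with the same dendrogram on rows and columns.
# scale = "none" keeps the correlations as they are.
heatmap(my_cor,
        Rowv = as.dendrogram(hc),
        Colv = "Rowv",
        symm = TRUE,
        scale = "none")
```

This gives a static plot with dendrograms in the margins; it does not reproduce corrplot's glyphs or heatmaply's interactive tooltips.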