Boxplot displays incorrectly when converting from factor to numeric

不打扰是莪最后的温柔 posted on 2019-12-11 04:35:59

Question


My graph displays correctly when I don't set an x scale. To make it look better, I converted U from factor to numeric so that I could use scale_x_continuous (following "How to convert a factor to an integer\numeric without a loss of information?"). After the conversion the boxplot is drawn incorrectly, but I can't apply the continuous scale without converting to numeric. Please run the sample code below with and without these two lines: main$U <- as.numeric(as.character(main$U)) and + scale_x_continuous(name="Temperature", limits=c(0, 160)). Thank you.

library("ggplot2")
library("plyr")

df<-data.frame(U = c(25, 25, 25, 25, 25, 85, 85, 85, 125, 125), 
               V =c(1.03, 1.06, 1.1,1.08,1.87,1.56,1.75,1.82, 1.85, 1.90), 
               type=c(2,2,2,2,2,2,2,2,2,2)) 

df1<-data.frame(U = c(25, 25,25,85, 85, 85, 85, 125, 125,125), 
                V =c(1.13, 1.24,1.3,1.17, 1.66,1.76,1.89, 1.90, 1.95,1.97), 
                type=c(5,5,5,5,5,5,5,5,5,5)) 

df2<-data.frame(U = c(25, 25, 25, 85, 85,85,125, 125,125), 
                V =c(1.03, 1.06, 1.56,1.75,1.68,1.71,1.82, 1.85,1.88), 
                type=c(7,7,7,7,7,7,7,7,7))

main <- rbind(df,df1,df2)
main$type <- as.factor(main$type)
main <- transform(main, type = revalue(type,c("2"="type2", "5"="type5", "7" = "type7")))
main$U <- as.factor(main$U)
main$U <- as.numeric(as.character(main$U))

ggplot(main, aes(U, V,color=type)) + 
  geom_boxplot(width=0.5/length(unique(main$type)), size=.3, position="identity") + 
  scale_x_continuous(name="Temperature", limits=c(0, 160))  

Answer 1:


You have to specify the group in your call to geom_boxplot. Once U is numeric, ggplot2 derives the default grouping only from the discrete variables in the plot (here type), so all temperatures of one type are pooled into a single box. To keep the legend you can use color=factor(U) (i.e., converting U back). To avoid losing information on groups that share the same x-values, I think it is best to create a new grouping column first: take all unique pairs of U and type, and create a new variable recording which of these pairs each row falls into.

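# Compare both key columns as character strings
main$U <- as.character(main$U)
main$type <- as.character(main$type)

# One row per distinct (U, type) pair, and an integer id for each pair
grp_keys <- unique(as.matrix(main[, c("U", "type")]))
grp_inds <- seq_len(nrow(grp_keys))

# For each row of main, find the id of the pair whose U and type both match
main$grps <- apply(main, 1, function(x) {
  grp_inds[colSums(as.character(x[c("U", "type")]) == t(grp_keys)) == length(c("U", "type"))]
  })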
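
The same kind of grouping column can also be built with base R's interaction(), which may be easier to read than the apply() call. This is only a minimal sketch: the data frame demo is a hypothetical cut-down stand-in that reuses a few rows from the question's data, and the resulting labels (such as "25.type2") differ from the integer ids used in this answer.

```r
# Hypothetical stand-in for `main`, using a few rows from the question's data
demo <- data.frame(
  U    = c(25, 25, 85, 85, 125, 125, 25, 85, 125),
  V    = c(1.03, 1.06, 1.56, 1.75, 1.85, 1.90, 1.13, 1.17, 1.90),
  type = c(rep("type2", 6), rep("type5", 3))
)

# One factor level per (U, type) combination that actually occurs;
# drop = TRUE discards combinations that never appear in the data
demo$grps <- interaction(demo$U, demo$type, drop = TRUE)

# Number of observations in each group
print(table(demo$grps))
```

Because grps is a factor, it can be passed straight to aes(group = grps) in the plotting call.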

Then plot. The width is increased because, on a numeric axis spanning 0 to 160, the original width makes the boxes look very narrow:

main$U <- as.numeric(as.character(main$U))
ggplot(main, aes(U, V,color=type)) + 
  geom_boxplot(aes(group = grps, color = type), width=20/length(unique(main$type)), size=.3, position="identity") +
  scale_x_continuous(name="Temperature", limits=c(0, 160))
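
One way to check how many boxes ggplot2 will actually draw is to inspect the computed layer data with ggplot_build(), which holds one row per box for a boxplot layer. The sketch below is an illustration under assumptions rather than part of the original answer: demo is a hypothetical cut-down copy of the data, p_default and p_grouped are illustrative names, and the group is given inline with interaction(U, type) instead of a precomputed column.

```r
library(ggplot2)

# Hypothetical stand-in for `main`, using a few rows from the question's data
demo <- data.frame(
  U    = c(25, 25, 85, 85, 125, 125, 25, 85, 125),
  V    = c(1.03, 1.06, 1.56, 1.75, 1.85, 1.90, 1.13, 1.17, 1.90),
  type = c(rep("type2", 6), rep("type5", 3))
)

# No explicit group: with a numeric x, only the discrete `type` defines groups
p_default <- ggplot(demo, aes(U, V, color = type)) +
  geom_boxplot(width = 10, position = "identity") +
  scale_x_continuous(name = "Temperature", limits = c(0, 160))

# Explicit group: one box per (U, type) combination
p_grouped <- ggplot(demo, aes(U, V, color = type)) +
  geom_boxplot(aes(group = interaction(U, type)), width = 10, position = "identity") +
  scale_x_continuous(name = "Temperature", limits = c(0, 160))

# Count the boxes each plot would draw (ggplot2 may warn about the
# continuous x aesthetic for the ungrouped plot)
n_default <- nrow(suppressWarnings(ggplot_build(p_default))$data[[1]])
n_grouped <- nrow(ggplot_build(p_grouped)$data[[1]])
print(c(default = n_default, grouped = n_grouped))
```

If the grouped count matches the number of distinct (U, type) pairs, each temperature has its own box per type, which is what the question's original factor-based plot showed.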



Source: https://stackoverflow.com/questions/48995692/boxplot-displays-incorrect-when-coverting-from-factor-to-numeric
