loop generate plots for variables in a data frame

Submitted by 限于喜欢 on 2019-12-12 18:06:19

Question


So I have a data frame, set up a bit like this:

Sample V1  V2 V3  Group1 Group2
bob    12  32  12  G1      G2
susan  43  23  54  G2      G2
mary   23  65  34  G1      G2

I am able to do a grouped boxplot of each variable (V1, V2, V3) individually, grouped by the Group1 and Group2 variables, but my real dataset has WAY more variables, and coding each plot individually would be tedious. Is there a way to loop the process and automate plot generation and export? For loops are still a bit of an obscure topic for me.

Here is the code I use to generate an individual plot:

png(filename = "filename.png")
ggplot(aes(y=data$V1, x=data$Group1, fill=data$Group2), data=data) + geom_boxplot()
dev.off()

Thanks!


Answer 1:


Here are several approaches for you. I'm guessing there is a duplicate, but if you're just starting out it's not always easy to apply those answers to your data.

library(reshape2)
library(ggplot2)
# create some data
set.seed(100)
n <- 500

dat <- data.frame(sample = sample(LETTERS[1:10], n, replace = TRUE),
                  V1 = sample(50, n, replace = TRUE),
                  V2 = sample(50, n, replace = TRUE),
                  V3 = sample(50, n, replace = TRUE),
                  Group1 = paste0("G", sample(3, n, replace = TRUE)),
                  Group2 = paste0("G", sample(5, n, replace = TRUE)))

Approach 1: melt and facet

dat_m <- melt(dat,measure.vars = c("V1","V2","V3"))

p1 <- ggplot(dat_m, aes(x = Group1,y = value,fill = Group2))+
  geom_boxplot() + facet_wrap(~variable)
p1

This gives one facet panel per variable, so it stops being feasible once you have many variables to plot.
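To make the melt step in approach 1 more concrete, here is a minimal sketch that applies the same `melt()` call to the three rows shown in the question. The object names `dat_small` and `long` are made up for this illustration. It only shows the shape of the long data that `facet_wrap(~variable)` relies on.

```r
library(reshape2)

# The three example rows from the question (dat_small is a hypothetical name)
dat_small <- data.frame(
  Sample = c("bob", "susan", "mary"),
  V1     = c(12, 43, 23),
  V2     = c(32, 23, 65),
  V3     = c(12, 54, 34),
  Group1 = c("G1", "G2", "G1"),
  Group2 = c("G2", "G2", "G2")
)

# Stack V1..V3 into two columns: 'variable' (which column the number
# came from) and 'value' (the number itself). All other columns are
# kept as identifiers and repeated for each measured variable.
long <- melt(dat_small, measure.vars = c("V1", "V2", "V3"))
print(long)
```

Each original row appears once per measured variable, so three rows and three variables give nine rows in `long`.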

Approach 2: a separate plot/image per variable, still using the long data. I split the long data by variable and create a plot for each chunk. As written, the code draws the plots on the current graphics device; the file-saving code is commented out.

lapply(split(dat_m, dat_m$variable), function(chunk){
  # one file name per variable, e.g. "plot_V1.png"
  myfilename <- sprintf("plot_%s.png", as.character(unique(chunk$variable)))

  p <- ggplot(chunk, aes(x = Group1,y = value,fill = Group2)) +
    geom_boxplot() + labs(title = myfilename)
  p
  # to save to a file instead, replace the line above with:
  # png(filename = myfilename)
  # print(p)
  # dev.off()
})
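For completeness, here is a minimal sketch of approach 2 with the file-saving lines switched on. It assumes long data with the same column names as `dat_m` and builds a small stand-in so the block runs on its own. `out_dir` and `saved_files` are hypothetical names, and the temporary directory is only a placeholder for wherever you want the images. The explicit `print(p)` matters: ggplot objects are not auto-printed inside a function or loop, so without it the PNG would be empty.

```r
library(ggplot2)

# Small stand-in for the long data (same column names as dat_m)
set.seed(100)
n <- 60
dat_m <- data.frame(
  Group1   = paste0("G", sample(3, n, replace = TRUE)),
  Group2   = paste0("G", sample(5, n, replace = TRUE)),
  variable = factor(rep(c("V1", "V2", "V3"), each = n / 3)),
  value    = sample(50, n, replace = TRUE)
)

out_dir <- tempdir()  # assumption: placeholder output directory

# One PNG per variable; returns the paths that were written
saved_files <- vapply(split(dat_m, dat_m$variable), function(chunk) {
  myfilename <- file.path(
    out_dir,
    sprintf("plot_%s.png", as.character(chunk$variable[1]))
  )
  p <- ggplot(chunk, aes(x = Group1, y = value, fill = Group2)) +
    geom_boxplot() +
    labs(title = basename(myfilename))

  png(filename = myfilename)
  print(p)   # explicit print is needed inside a function
  dev.off()

  myfilename
}, character(1))
```

`saved_files` is a named character vector (one entry per variable), which makes it easy to check afterwards which files were produced.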

A third approach is to loop over the names of the columns you're interested in:

#vector of columns you want to plot
mycols <- c("V1","V2","V3")

#plotting for each column. Note that I've put the 'fixed' variables
#inside aes in the main call to ggplot, and the 'varying' variable
#inside aes_string in the call to geom_boxplot

lapply(mycols, function(cc){
  myfilename <- sprintf("plot_%s.png",cc)
  p <- ggplot(dat, aes(x = Group1,fill = Group2)) +
    geom_boxplot(aes_string(y = cc)) + labs(title = myfilename)
  p
  # to save to a file instead, replace the line above with:
  # png(filename = myfilename)
  # print(p)
  # dev.off()
})
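In recent ggplot2 releases `aes_string()` is deprecated in favour of the `.data` pronoun, so the third approach could also be sketched as below. This is a variant under stated assumptions, not the code from the original answer: the small `dat` stand-in, `out_dir`, `plots`, and the 6 x 4 inch image size are all chosen for illustration, and `ggsave()` is swapped in for the `png()` / `print()` / `dev.off()` sequence.

```r
library(ggplot2)

# Small stand-in data with the same column names as dat
set.seed(100)
n <- 60
dat <- data.frame(
  V1     = sample(50, n, replace = TRUE),
  V2     = sample(50, n, replace = TRUE),
  V3     = sample(50, n, replace = TRUE),
  Group1 = paste0("G", sample(3, n, replace = TRUE)),
  Group2 = paste0("G", sample(5, n, replace = TRUE))
)

mycols  <- c("V1", "V2", "V3")  # columns to plot
out_dir <- tempdir()            # assumption: placeholder output directory

plots <- lapply(mycols, function(cc) {
  # .data[[cc]] looks up the column whose name is stored in the string cc
  p <- ggplot(dat, aes(x = Group1, y = .data[[cc]], fill = Group2)) +
    geom_boxplot() +
    labs(title = cc, y = cc)

  # ggsave() picks the device from the file extension and takes the plot
  # object directly, so no explicit print() is needed here
  ggsave(file.path(out_dir, sprintf("plot_%s.png", cc)),
         plot = p, width = 6, height = 4)
  p
})
names(plots) <- mycols
```

Keeping the plot objects in the named list `plots` means a single one can still be inspected interactively, for example with `plots[["V2"]]`.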


Source: https://stackoverflow.com/questions/33689150/loop-generate-plots-for-variables-in-a-data-frame
