boxplot

How to change x-axis tick label names, order and boxplot colour using R ggplot?

China☆狼群 submitted on 2019-11-27 19:08:00
I have a folder containing csv files, each with two columns of data, e.g.:

    0,red
    15.657,red
    0,red
    0,red
    4.429,red
    687.172,green
    136.758,green
    15.189,red
    0.152,red
    23.539,red
    0.348,red
    0.17,blue
    0.171,red
    0,red
    61.543,green
    0.624,blue
    0.259,red
    338.714,green
    787.223,green
    1.511,red
    0.422,red
    9.08,orange
    7.358,orange
    25.848,orange
    29.28,orange

I am using the following R code to generate the boxplots:

    files <- list.files(path="D:/Ubuntu/BoxPlots/test/", pattern=NULL, full.names=FALSE, recursive=FALSE)
    files.len <- length(files)
    col_headings <- c("RPKM", "Lineage")
    for (i in files){
      i2 <- paste(i,"png", sep=

change thickness median line geom_boxplot()

我的梦境 submitted on 2019-11-27 18:00:20
Question: I want to make some modifications to a geom_boxplot(). Because my boxplots are sometimes really "small" (see the yellow and green clades in the graphic), I want to highlight the median even more. Is it possible to adjust the thickness of the median line?

Answer 1: This solution is not obvious from the documentation, but luckily does not require us to edit the source code of ggplot2. After digging through the source of ggplot2 I found that the thickness of the median line is controlled by the

In ggplot2, what do the end of the boxplot lines represent?

荒凉一梦 submitted on 2019-11-27 17:58:54
I can't find a description of what the end points of the lines of a boxplot represent. For example, here there are point values above and below where the lines end. (I realize that the top and bottom of the box are the 25th and 75th percentiles, and the centerline is the 50th.) I assume, as there are points above and below the lines, that the line ends do not represent the max/min values.

csgillespie
The "dots" at the end of the boxplot represent outliers. There are a number of different rules for determining if a point is an outlier, but the method that R and ggplot use is the "1.5 rule". If a data point is: less
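The "1.5 rule" mentioned in the answer can be sketched in a few lines. This is a minimal illustration in Python rather than R, assuming the common formulation: a point is an outlier if it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile, and each line ends at the most extreme data point that is not an outlier. The function name whisker_summary and the sample values are made up for the example, and the choice of quartile interpolation is an assumption (different tools compute quartiles slightly differently).

```python
import statistics


def whisker_summary(values):
    """Return quartiles, whisker ends and outliers under the 1.5 * IQR rule."""
    data = sorted(values)
    # Assumption: linear interpolation between data points ("inclusive"),
    # which is one of several quartile definitions in use.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Points outside the fences are drawn as individual dots.
    outliers = [v for v in data if v < low_fence or v > high_fence]
    # The lines end at the most extreme points that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up sample: nine evenly spaced values plus one extreme value.
summary = whisker_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 40])
print(summary)
```

In this sample the value 40 lies above the upper fence, so it would be drawn as a dot, and the upper line ends at 9 rather than at the maximum.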

Sort boxplot by mean (and not median) in R

拥有回忆 submitted on 2019-11-27 17:55:14
Question: I have a simple boxplot, showing the distribution of a score for factor TYPE:

    myDataFrame = data.frame( TYPE=c("a","a","b","b","c","c"), SCORE=c(1,1,2,3,2,1) )
    boxplot( SCORE~TYPE, data=myDataFrame )

The various types are shown in the order they have in the data frame. I'd like to sort the boxplot by the mean of SCORE in each TYPE (in the example above, the order should be a, c, b). Any hints?

Answer 1: This is a job for reorder():

    myDataFrame$TYPE <- with(myDataFrame, reorder(TYPE, SCORE, mean))
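For readers who want to see what ordering by group mean amounts to, here is a small sketch of the same idea in plain Python, using the data from the question. It is only an illustration of the ordering step, not a replacement for reorder(); the variable names are chosen for the example.

```python
from collections import defaultdict
from statistics import mean

# Data from the question: one TYPE label per SCORE value.
types = ["a", "a", "b", "b", "c", "c"]
scores = [1, 1, 2, 3, 2, 1]

# Collect the scores belonging to each type.
groups = defaultdict(list)
for type_label, score in zip(types, scores):
    groups[type_label].append(score)

# Order the type labels by the mean of their scores (ascending).
order = sorted(groups, key=lambda type_label: mean(groups[type_label]))
print(order)  # ['a', 'c', 'b']
```

The group means are 1 for a, 2.5 for b and 1.5 for c, which gives the order a, c, b that the question asks for. Boxes drawn from the grouped lists in this order are sorted by mean.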

Is it possible to draw a matplotlib boxplot given the percentile values instead of the original inputs?

有些话、适合烂在心里 submitted on 2019-11-27 17:53:11
Question: From what I can see, the boxplot() method expects a sequence of raw values (numbers) as input, from which it then computes percentiles to draw the boxplot(s). I would like to have a method by which I could pass in the percentiles and get the corresponding boxplot. For example: assume that I have run several benchmarks and for each benchmark I've measured latencies (floating point values). Now additionally, I have precomputed the percentiles for these values. Hence for each benchmark, I have
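One way to approach this, sketched here under stated assumptions, is matplotlib's Axes.bxp method, which draws boxes from a list of dictionaries of precomputed statistics (keys such as med, q1, q3, whislo, whishi) instead of raw values. The helper names, the choice of the 5th and 95th percentiles as the line ends, and all numbers below are invented for illustration; the question does not say which percentiles are available.

```python
def stats_from_percentiles(label, p5, p25, p50, p75, p95):
    """Build one stats dict in the layout that matplotlib's Axes.bxp expects."""
    return {
        "label": label,
        "whislo": p5,   # lower line end (assumption: 5th percentile)
        "q1": p25,      # bottom of the box
        "med": p50,     # median line
        "q3": p75,      # top of the box
        "whishi": p95,  # upper line end (assumption: 95th percentile)
        "fliers": [],   # no raw data, so no outlier points
    }


def draw_boxplots(stats, path):
    """Draw the precomputed boxes and save the figure to the given path."""
    # Imported here so the helper above can be used without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bxp(stats, showfliers=False)
    ax.set_ylabel("latency")
    fig.savefig(path)
    plt.close(fig)


# Placeholder percentile values, not measurements.
benchmarks = [
    stats_from_percentiles("bench A", 1.0, 2.0, 3.0, 4.5, 7.0),
    stats_from_percentiles("bench B", 0.5, 1.5, 2.5, 3.0, 6.0),
]
```

Calling draw_boxplots(benchmarks, "latency.png") would then write one box per benchmark; the file name is arbitrary.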

Multiple boxplots using ggplot

自闭症网瘾萝莉.ら submitted on 2019-11-27 16:11:11
I have a dataframe that looks like the one below, with 6 columns and 1000 rows (tab separated). The column headings (0, 30, 60, 120, 240 and 360) are a time series (with 0 representing 0 mins, 30 meaning 30 mins and so on). I'd like to create 6 boxplots corresponding to the columns using ggplot2 in a single plot, keeping in mind that they need to be spaced based on the time difference. It seems I would need to melt the columns, but I can't figure out a way to do so. Any help would be much appreciated.

         0    30    60   120   240   360
    1    1    NA    NA    NA     1     1
    2   NA    NA    NA    NA    NA    NA
    3   NA    NA     1     1     1     1
    4  0.5  0.21  0.15     1  0.38     0
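The "melt" step the question refers to turns the wide table (one column per time point) into long form (one row per time/value pair). A minimal sketch of that reshaping in plain Python, using the four rows shown above with NA written as None, might look like this. The function names are chosen for the example, and this is an illustration of the reshaping only, not the ggplot2 solution.

```python
# Column headings are times in minutes; rows are the four shown in the question.
header = [0, 30, 60, 120, 240, 360]
rows = [
    [1, None, None, None, 1, 1],
    [None, None, None, None, None, None],
    [None, None, 1, 1, 1, 1],
    [0.5, 0.21, 0.15, 1, 0.38, 0],
]


def melt(header, rows):
    """Turn the wide table into (minutes, value) pairs, skipping missing values."""
    long_rows = []
    for row in rows:
        for minutes, value in zip(header, row):
            if value is not None:
                long_rows.append((minutes, value))
    return long_rows


def group_by_time(long_rows, header):
    """Collect the values for each time point, keeping the time order."""
    grouped = {minutes: [] for minutes in header}
    for minutes, value in long_rows:
        grouped[minutes].append(value)
    return grouped


long_rows = melt(header, rows)
grouped = group_by_time(long_rows, header)
print(grouped)
```

Because the keys of grouped are the times in minutes, they can double as x positions when the boxes are drawn, so that the gap between 120 and 240 is wider than the gap between 0 and 30.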

Generate ggplot2 boxplot with different colours for multiple groups

萝らか妹 submitted on 2019-11-27 15:27:17
Question: I'm fairly new to R and ggplot. I'm trying to generate a boxplot sorted by two variables, in my case Species and Experiment. What I have so far, using

    ggplot(DF, aes(Species, Protein, fill=Experiment, dodge=Experiment)) + stat_boxplot(geom ='errorbar') + geom_boxplot()

are boxplots of my species, where each species has 2 bars, one for each experiment. My question now is: is it possible to change the colours so that I have different colours per species and, let's say, different shading of

Matplotlib boxplot using precalculated (summary) statistics

若如初见. submitted on 2019-11-27 13:18:21
Question: I need to do a boxplot (in Python and matplotlib) but I do not have the original "raw" data. What I have are precalculated values for max, min, mean, median and IQR (normal distribution), but I'd still like to do a boxplot. Of course plotting outliers isn't possible, but besides that I guess all the information is there. I've searched all over for an answer without success. The closest I've come is the same question but for R (which I'm unfamiliar with). See Is it possible to plot a boxplot from
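The summary values listed in the question do not include the quartiles themselves, only the IQR. A rough sketch of how they could still be mapped onto a box, assuming the distribution is symmetric around the median (which the question's "normal distribution" remark suggests), is to place the quartiles half an IQR on either side of the median. The dictionary keys follow the layout that matplotlib's Axes.bxp accepts for precomputed statistics; the function name and the numbers are placeholders, not data from the question.

```python
def stats_from_summary(label, minimum, maximum, mean, median, iqr):
    """Map min/max/mean/median/IQR onto box statistics.

    Assumption: the distribution is symmetric, so the quartiles sit
    half an IQR below and above the median.
    """
    half = iqr / 2.0
    return {
        "label": label,
        "whislo": minimum,     # lines run to min and max, since no outliers are known
        "q1": median - half,
        "med": median,
        "q3": median + half,
        "whishi": maximum,
        "mean": mean,
        "fliers": [],
    }


# Placeholder summary values for illustration only.
box = stats_from_summary("sample", minimum=2.0, maximum=18.0,
                         mean=10.0, median=10.0, iqr=4.0)
print(box)
```

A list of such dictionaries can be passed to Axes.bxp, with showmeans=True if the mean marker should be drawn as well. If the data are skewed, the symmetric placement of the quartiles will be wrong, so this only fits the case the question describes.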

python matplotlib filled boxplots

吃可爱长大的小学妹 submitted on 2019-11-27 13:15:01
Question: Does anyone know if we can plot filled boxplots in python matplotlib? I've checked http://matplotlib.org/api/pyplot_api.html but I couldn't find useful information about that.

Answer 1: The example that @Fenikso shows does this, but in a sub-optimal way. Basically, you want to pass patch_artist=True to boxplot. As a quick example:

    import matplotlib.pyplot as plt
    import numpy as np
    data = [np.random.normal(0, std, 1000) for std in range(1, 6)]
    plt.boxplot(data,

How to change order of boxplots when using ggplot2?

你。 submitted on 2019-11-27 12:45:19
This question follows from another one. I was unable to implement the answers there. Define:

    df2 <- data.frame(variable=rep(c("vnu.shr","vph.shr"),each=10), value=seq(1:20))

Plot:

    require(ggplot2)
    qplot(variable,value, data=df2,geom="boxplot")+ geom_jitter(position=position_jitter(w=0.1,h=0.1))

I would like to have the boxplots in the reverse order (e.g. the one on the right moved to the left, and so on). I have tried various ways of reordering the factors using levels, ordered, relevel, rev and so on, but I simply cannot seem to get the syntax right.

Have you tried this:

    df2$variable <- factor(df2$variable,