Make a histogram clearer by assigning names to different segments of each bar in R

若如初见. Submitted on 2019-12-12 14:27:42

Question


Assume that I have a data frame with two columns and 19 rows (see below). The left column is the name of the cell line and the right one is the expression of the gene ZEB1 in the corresponding cell line.

    CellLines   ZEB1
    600MPE  2.8186
    AU565   2.783
    BT20    2.7817
    BT474   2.6433
    BT483   2.4994
    BT549   3.035
    CAMA1   2.718
    DU4475  2.8005
    HBL100  2.6745
    HCC38   3.2884
    HCC70   2.597
    HCC202  2.8557
    HCC1007 2.7794
    HCC1008 2.4513
    HCC1143 2.8159
    HCC1187 2.6372
    HCC1428 2.7327
    HCC1500 2.7564
    HCC1569 2.8093

I've drawn a histogram of this data using the simple code below:

hist(Heiser$ZEB1[1:19], breaks=50, col="grey")

This gives me a histogram whose x axis is the gene expression level and whose y axis is the frequency of that expression level among the cell lines. However, I would like to add the names of the cell lines at their positions on the histogram. How can I do that?

Thanks in advance for your time on answering this :-) Best.


Answer 1:


One alternative is to use text() to add labels to the plot:

hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
text(Heiser$ZEB1, 2, labels= Heiser$CellLines, srt=90)

Edit:

Positioning labels that fall in the same bin one above another:

library(dplyr)
# Keep the histogram object so its break points can be reused
Heiser_hist <- hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
# Assign each cell line to its histogram bin (include.lowest matches hist's default)
Heiser$cut <- cut(Heiser$ZEB1, breaks=Heiser_hist$breaks, include.lowest=TRUE)
# Spread the labels within each bin between y = 1 and y = 2
Heiser <- Heiser %>% group_by(cut) %>% mutate(pos = seq(from=1, to=2, length.out=n())) %>% ungroup()
with(Heiser, text(ZEB1, pos, labels=CellLines, srt=45, cex=0.9))

You could try drawing the text without rotation by changing srt, but the overplotting is worse in that case. You could also adjust the x axis to reduce overplotting.
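As a variant of the same idea that does not need dplyr, here is a minimal base R sketch that stacks one label per unit of bar height at each bin's midpoint, so every cell line sits in its own segment of the bar. The data frame is rebuilt from the table in the question; breaks = 10, cex = 0.7, and the names h, bin, and slot are my own choices, not values from the question.

```r
# Rebuild the question's data (values copied from the table in the question)
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005, 2.6745,
           3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372, 2.7327,
           2.7564, 2.8093),
  stringsAsFactors = FALSE
)

# Assumption: fewer, wider bins than in the question, to leave room for labels
h <- hist(Heiser$ZEB1, breaks = 10, col = "grey",
          main = "ZEB1 expression", xlab = "ZEB1")

# Bin index of each cell line, using the same break points as the histogram
bin <- cut(Heiser$ZEB1, breaks = h$breaks, include.lowest = TRUE, labels = FALSE)

# Running position of each cell line within its bin: 1, 2, 3, ...
slot <- ave(seq_along(bin), bin, FUN = seq_along)

# Centre each label in its own unit-height segment of the bar
text(h$mids[bin], slot - 0.5, labels = Heiser$CellLines, cex = 0.7)
```

Labels in neighbouring bins can still touch if the plotting device is narrow; a smaller cex or a wider device should help.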




Answer 2:


You are going to have a problem with overlapping labels (not sure what you want to do there) but

hist(Heiser$ZEB1[1:19], breaks=50, col="grey", xaxt="n")
axis(1,Heiser$ZEB1, Heiser$CellLines )

I think gives you what you're after based on the description.

Are you sure you don't want a bar plot instead? Because with a histogram, one bar does not represent one observation. The histogram is an attempt to estimate the underlying probability density function for continuous variables.
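If one bar per cell line is what is wanted, a bar plot along these lines may be closer to the goal. This is only a sketch: the data frame is rebuilt from the table in the question, and the sorting, margins, axis label, and the name bar_mids are my own choices.

```r
# Rebuild the question's data (values copied from the table in the question)
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005, 2.6745,
           3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372, 2.7327,
           2.7564, 2.8093),
  stringsAsFactors = FALSE
)

# Sort by expression so neighbouring bars are easy to compare (a choice, not a requirement)
Heiser <- Heiser[order(Heiser$ZEB1), ]

# Enlarge the bottom margin so the rotated cell-line names are not clipped
op <- par(mar = c(7, 4, 2, 1))

# One bar per cell line; las = 2 draws the names perpendicular to the axis
bar_mids <- barplot(Heiser$ZEB1, names.arg = Heiser$CellLines,
                    las = 2, col = "grey", ylab = "ZEB1 expression")

par(op)  # restore the previous graphics settings
```

barplot() returns the x positions of the bar centres, so bar_mids can be passed to text() if values should be printed above the bars as well.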



Source: https://stackoverflow.com/questions/26112990/make-a-histogram-clearer-by-assigning-names-to-different-segments-of-each-bar-in
