Gnuplot Histogram Cluster (Bar Chart) with One Line per Category

Anonymous (unverified), submitted 2019-12-03 01:10:02

Question:

Histogram Cluster / Bar Chart

I'm trying to generate the following histogram cluster out of this data file with gnuplot, where each category is represented in a separate line per year in the data file:

    # datafile
    year   category        num_of_events
    2011   "Category 1"    213
    2011   "Category 2"    240
    2011   "Category 3"    220
    2012   "Category 1"    222
    2012   "Category 2"    238
    ...

But I don't know how to do it with one line per category. I would be glad if anybody has got an idea how to do this with gnuplot.

Stacked Histogram Cluster / Stacked Bar Chart

Even better would be a stacked histogram cluster like the following, where the stacked sub categories are represented by separate columns in the datafile:

    # datafile
    year   category        num_of_events_for_A    num_of_events_for_B
    2011   "Category 1"    213                    30
    2011   "Category 2"    240                    28
    2011   "Category 3"    220                    25
    2012   "Category 1"    222                    13
    2012   "Category 2"    238                    42
    ...

Thanks a lot in advance!

Answer 1:

After some research, I came up with two different solutions.

Required: Splitting the data file

Both solutions require splitting the data file into several files, grouped by the value of one column. For this, I've written a short Ruby script, which can be found in this gist:

https://gist.github.com/fiedl/6294424

The script is used as follows. To split the data file data.csv into data.Category1.csv and data.Category2.csv, call:

    # bash
    ruby categorize_csv.rb --column 2 data.csv

    # data.csv
    # year   category   num_of_events_for_A   num_of_events_for_B
    "2011";"Category1";"213";"30"
    "2011";"Category2";"240";"28"
    "2012";"Category1";"222";"13"
    "2012";"Category2";"238";"42"
    ...

    # data.Category1.csv
    # year   category   num_of_events_for_A   num_of_events_for_B
    "2011";"Category1";"213";"30"
    "2012";"Category1";"222";"13"
    ...

    # data.Category2.csv
    # year   category   num_of_events_for_A   num_of_events_for_B
    "2011";"Category2";"240";"28"
    "2012";"Category2";"238";"42"
    ...
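For readers who cannot reach the gist, the splitting step can be approximated with a few lines of Ruby. This is a minimal sketch and not the code from the gist: the function name split_by_column is hypothetical, the column is 1-based as in the call above, and the sketch assumes that no field contains a semicolon and that the first line starting with # is a header worth copying into every output file.

```ruby
# Hypothetical helper, NOT the gist's categorize_csv.rb: write one output file
# per distinct value found in the given 1-based column.
# Assumption: fields are separated by ";" and never contain ";" themselves.
def split_by_column(path, column, separator: ";")
  groups = Hash.new { |hash, key| hash[key] = [] }
  header = nil

  File.foreach(path) do |raw|
    line = raw.chomp
    next if line.strip.empty?
    if line.start_with?("#")
      header ||= line          # keep the first comment line as the header
      next
    end
    # Strip the surrounding double quotes so "Category1" becomes Category1.
    key = line.split(separator)[column - 1].delete('"')
    groups[key] << line
  end

  base = path.sub(/\.csv\z/, "")
  groups.map do |key, lines|
    out = "#{base}.#{key}.csv"
    File.write(out, ([header].compact + lines).join("\n") + "\n")
    out                        # return the names of the files written
  end
end
```

Calling split_by_column("data.csv", 2) groups by category as shown above. Under the same assumptions, passing 1 as the column would group by year instead, which is the layout the second solution expects.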

Solution 1: Stacked Box Plot

Strategy: One data file per category. One column per stack. The bars of the histogram are plotted "manually" by using the "with boxes" argument of gnuplot.

Upside: Full flexibility concerning bar sizes, caps, colors, etc.

Downside: Bars have to be placed manually.
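The manual placement comes down to a little arithmetic on the x coordinate. In solution1.gnuplot, each box is centered at year + offset + i*dx, where i is the 0-based index of the category. The Ruby sketch below only replays that arithmetic with the values from the script; the names box_center and category_index are made up for illustration.

```ruby
# Same values as the variables set in solution1.gnuplot.
NUM_OF_CATEGORIES = 2
BOX_WIDTH = 0.3 / NUM_OF_CATEGORIES   # set boxwidth 0.3/num_of_categories
DX        = 0.5 / NUM_OF_CATEGORIES   # dx=0.5/num_of_categories
OFFSET    = -0.1                      # offset=-0.1

# x coordinate of the box center for a 0-based category index (hypothetical helper).
def box_center(year, category_index)
  year + OFFSET + category_index * DX
end

[2011, 2012].each do |year|
  NUM_OF_CATEGORIES.times do |i|
    center = box_center(year, i)
    left   = center - BOX_WIDTH / 2
    right  = center + BOX_WIDTH / 2
    puts format("%d category %d: center %.3f, spans %.3f to %.3f",
                year, i + 1, center, left, right)
  end
end
```

With two categories, the boxes are 0.15 wide and sit at year - 0.1 and year + 0.15, which leaves a gap of 0.1 between them. Adding a third category means adding a third pair of plot clauses shifted by 2*dx.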

    # solution1.gnuplot
    reset
    set terminal postscript eps enhanced 14

    set datafile separator ";"

    set output 'stacked_boxes.eps'

    set auto x
    set yrange [0:300]
    set xtics 1

    set style fill solid border -1

    num_of_categories=2
    set boxwidth 0.3/num_of_categories
    dx=0.5/num_of_categories
    offset=-0.1

    # For each category, the full-height box (columns 3+4) is drawn first,
    # then the box for column 3 alone is drawn over it.
    plot 'data.Category1.csv' using ($1+offset):($3+$4) title "Category 1 A" linecolor rgb "#cc0000" with boxes, \
         ''                   using ($1+offset):3 title "Category 1 B" linecolor rgb "#ff0000" with boxes, \
         'data.Category2.csv' using ($1+offset+dx):($3+$4) title "Category 2 A" linecolor rgb "#00cc00" with boxes, \
         ''                   using ($1+offset+dx):3 title "Category 2 B" linecolor rgb "#00ff00" with boxes

The result looks like this:

Solution 2: Native Gnuplot Histogram

Strategy: One data file per year. One column per stack. The histogram is produced using the regular histogram mechanism of gnuplot.

Upside: Easier to use, since the bars do not have to be positioned manually.

Downside: Since all categories are in one file, each category has the same color.

    # solution2.gnuplot
    reset
    set terminal postscript eps enhanced 14

    set datafile separator ";"

    set output 'histo.eps'
    set yrange [0:300]

    set style data histogram
    set style histogram rowstack gap 1
    set style fill solid border -1
    set boxwidth 0.5 relative

    plot newhistogram "2011", \
           'data.2011.csv' using 3:xticlabels(2) title "A" linecolor rgb "red", \
           ''              using 4:xticlabels(2) title "B" linecolor rgb "green", \
         newhistogram "2012", \
           'data.2012.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
           ''              using 4:xticlabels(2) title "" linecolor rgb "green", \
         newhistogram "2013", \
           'data.2013.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
           ''              using 4:xticlabels(2) title "" linecolor rgb "green"

The result looks like this:
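The plot command in solution2.gnuplot grows by one newhistogram clause per year, so with many years it may be easier to generate it than to type it. The following Ruby sketch builds the same command text for any list of years; histogram_plot_command is a hypothetical name, and the file naming data.<year>.csv is taken from the script above.

```ruby
# Hypothetical helper: build the plot command of solution2.gnuplot for a list
# of years. As in the script, only the first histogram carries legend titles.
def histogram_plot_command(years)
  clauses = years.each_with_index.map do |year, index|
    title_a = index.zero? ? "A" : ""
    title_b = index.zero? ? "B" : ""
    [
      %(newhistogram "#{year}"),
      %(       'data.#{year}.csv' using 3:xticlabels(2) title "#{title_a}" linecolor rgb "red"),
      %(       '' using 4:xticlabels(2) title "#{title_b}" linecolor rgb "green")
    ].join(", \\\n")             # gnuplot line continuation: backslash at end of line
  end
  "plot " + clauses.join(", \\\n     ")
end

puts histogram_plot_command([2011, 2012, 2013])
```

The printed text can be pasted in place of the plot command in the script, or appended to the settings part of the script before running gnuplot on it.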
