histogram

How to get Histogram of all columns in a large CSV / RDD[Array[double]] using Apache Spark Scala?

五迷三道 submitted on 2020-01-11 01:43:11
Question: I am trying to calculate a histogram of every column of a CSV file using Spark with Scala. I found that DoubleRDDFunctions supports histogram, so I wrote the following to get the histogram of all columns: (1) get the column count; (2) create an RDD[Double] for each column and calculate the histogram of each RDD using DoubleRDDFunctions.
    var columnIndexArray = Array.tabulate(rdd.first().length) (_ * 1)
    val histogramData = columnIndexArray.map(columns => { rdd.map(lines => lines(columns)).histogram(6) })
Is it a good
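
The per-column idea in the question can be illustrated without Spark. The following is a minimal plain-Python sketch, not Spark code: it assumes equal-width buckets between each column's minimum and maximum, with the last bucket closed on the right. The names `column_histogram` and `histogram_all_columns` and the sample rows are hypothetical.

```python
from typing import List, Sequence, Tuple


def column_histogram(values: Sequence[float], bucket_count: int) -> Tuple[List[float], List[int]]:
    """Return (bucket edges, counts) for equal-width buckets between min and max."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate column: every value is identical, so use a single bucket.
        return [lo, hi], [len(values)]
    width = (hi - lo) / bucket_count
    edges = [lo + i * width for i in range(bucket_count)] + [hi]
    counts = [0] * bucket_count
    for v in values:
        # Clamp the index so the maximum value lands in the last bucket.
        idx = min(int((v - lo) / width), bucket_count - 1)
        counts[idx] += 1
    return edges, counts


def histogram_all_columns(rows: Sequence[Sequence[float]], bucket_count: int):
    """Transpose rows into columns and build one histogram per column."""
    columns = zip(*rows)
    return [column_histogram(list(col), bucket_count) for col in columns]


# Hypothetical sample data: four rows, two columns.
rows = [
    [0.0, 10.0],
    [1.0, 20.0],
    [2.0, 30.0],
    [3.0, 40.0],
]
for edges, counts in histogram_all_columns(rows, 2):
    print(edges, counts)
```

Like the snippet in the question, this computes each column separately, so the cost grows with the number of columns.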

save a pandas.Series histogram plot to file

女生的网名这么多〃 submitted on 2020-01-10 06:49:08
Question: In an IPython Notebook, I first create a pandas Series object and then call its instance method .hist(), and the browser displays the figure. How can I save this figure to a file with commands in the script, rather than by right-clicking and choosing "save as"?
Answer 1: Use the Figure.savefig() method, like so:
    ax = s.hist()  # s is an instance of Series
    fig = ax.get_figure()
    fig.savefig('/path/to/figure.pdf')
The file name doesn't have to end in pdf; there are many options. Check out the documentation.

How to make 3D histogram in R

只愿长相守 submitted on 2020-01-09 19:04:43
Question: This is my goal: plot the frequency of y according to x on the z axis. These are my problems: I have a two-column array (x and y) and need to divide x into classes (e.g. of width 0.2 or 0.5) and calculate the frequency of y for each class of x. The plot should look like an x-y plot in the "ground" plane, with the frequency on the z axis. It could be a surface or a 3D histogram. I tried to make it using the hist3D function of the plot3D package, but I don't know what I am doing wrong. This is an
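
The binning step described in the question (divide x into classes, then count y per class) can be sketched independently of the plotting package. This is a plain-Python illustration rather than R: the function names, the bin widths, and the sample points are all hypothetical, and out-of-range values are clamped into the edge bins as a simplifying assumption.

```python
from typing import List, Sequence


def bin_index(value: float, lo: float, width: float, n_bins: int) -> int:
    """Map a value to a bin index, clamping out-of-range values to the edge bins."""
    idx = int((value - lo) / width)
    return min(max(idx, 0), n_bins - 1)


def frequency_grid(
    xs: Sequence[float], ys: Sequence[float],
    x_lo: float, x_width: float, x_bins: int,
    y_lo: float, y_width: float, y_bins: int,
) -> List[List[int]]:
    """Count points per (x class, y class) cell; grid[i][j] is the z height."""
    grid = [[0] * y_bins for _ in range(x_bins)]
    for x, y in zip(xs, ys):
        i = bin_index(x, x_lo, x_width, x_bins)
        j = bin_index(y, y_lo, y_width, y_bins)
        grid[i][j] += 1
    return grid


# Hypothetical data: x classes of width 0.5, y classes of width 1.0.
xs = [0.1, 0.3, 0.35, 0.9]
ys = [1.0, 1.0, 2.5, 0.2]
grid = frequency_grid(xs, ys, 0.0, 0.5, 2, 0.0, 1.0, 3)
for row in grid:
    print(row)
```

The resulting matrix of counts is the kind of input a 3D bar or surface plot typically takes as its z values, with one row per x class and one column per y class.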

hist distribution using ggplot2

北战南征 submitted on 2020-01-07 07:58:10
Question: I plot the distribution of a vector with qplot. I don't know the distribution in advance, because it is calculated in a function; I only know that the x values are between 0 and 1. I use the command line below and get the attached histogram. As the distribution is jammed together, how can I spread it out so that it becomes clear? Which parameters should I use, or which other functions produce a histogram? Moreover, how can I color it so that the bins become more distinguishable? Keeping the color legend may or

R/ggplot2 - Overlapping labels on facet_grid

蹲街弑〆低调 submitted on 2020-01-07 03:58:29
Question: Folks, I am plotting histograms using geom_histogram and I would like to label each histogram with its mean value (I am using the mean for the sake of this example). The issue is that I am drawing multiple histograms in one facet, and the labels overlap. This is an example:
    library(ggplot2)
    df <- data.frame (type=rep(1:2, each=1000), subtype=rep(c("a","b"), each=500), value=rnorm(4000, 0,1))
    plt <- ggplot(df, aes(x=value, fill=subtype)) + geom_histogram(position="identity", alpha=0.4)
    plt <-

How to dynamically update HistogramDataset with two series in jfreechart?

我们两清 submitted on 2020-01-07 03:49:46
Question: I want to dynamically update two separate series in a JFreeChart histogram. When I look at HistogramDataset, it doesn't seem to have a method for that. Is this possible? I know it can be done with SimpleHistogramDataset, but I need to have two series on this chart.
Answer 1: Some alternatives:
Replace the HistogramDataset with each update:
    chart.getXYPlot().setDataset(newDataset);
Add a second SimpleHistogramDataset and XYItemRenderer to the plot:
    SimpleHistogramDataset newDataset =

How to adapt HOG features vector to linear svm input

戏子无情 submitted on 2020-01-07 02:34:40
Question: I'm using HOG to extract a set of features from an image A. HOG returns a feature vector of 1xN elements. However, the linear SVM accepts only 2 features for each sample, i.e. the training data matrix's size is Mx2. How can I adapt the HOG vector so that it can be used to train a linear SVM? Please help me. Thanks.
Answer 1: What do you mean by "the linear SVM accept only 2 features for each sample"? You may be confused about how the SVM function accepts its training data. Here's a quick example of how

Multi-group histogram with group-specific frequencies

假装没事ソ submitted on 2020-01-06 15:42:11
Question: First off, I've already read the following thread: "ggplot2 - Multi-group histogram with in-group proportions rather than frequency". I followed the ddply suggestion and it didn't seem to work for my data. Logically the code should work perfectly on my dataset, and I have no idea what I'm doing wrong. Overall: I'd like to make a histogram (I'm learning ggplot) that displays the genotype frequency in each of my study groups. Something like this: Here's a mock data set that mirrors my own:
    df<-data
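
The quantity the question is after, the frequency of each genotype within its own study group rather than across the whole data set, can be sketched in plain Python. This is an illustration of the arithmetic only, not the ddply or ggplot code from the thread; the function name, the group labels, and the genotype labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def genotype_proportions(records: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """For each group, return the share of each genotype within that group."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for group, genotype in records:
        counts[group][genotype] += 1
    proportions = {}
    for group, genotype_counts in counts.items():
        total = sum(genotype_counts.values())  # group size, not overall size
        proportions[group] = {g: n / total for g, n in genotype_counts.items()}
    return proportions


# Hypothetical mock data: (study group, genotype) pairs.
records = [
    ("case", "AA"), ("case", "AA"), ("case", "AB"), ("case", "BB"),
    ("control", "AA"), ("control", "AB"),
]
for group, shares in genotype_proportions(records).items():
    print(group, shares)
```

Dividing by the group total rather than the overall total is what makes the bars within each group sum to 1, even when the groups differ in size.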

Concatenate multiple histograms in matplotlib

怎甘沉沦 submitted on 2020-01-06 14:46:08
Question: I am dealing with a set of time series data. I have separated this set into 108 windows by time (a single time window is 1 month long), and then I plot a histogram for each window. So I have 108 histograms, from time window 1 to window 108, in chronological order (from November 2006 to October 2015). What I want to do is plot all 108 histograms as one horizontally long histogram, so it is like joining them all together. For example, plot histogram for
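
One way to read "one horizontally long histogram" is to bin every window over the same value range and lay the per-window bars side by side in chronological order. The sketch below shows only that bookkeeping in plain Python, assuming a shared range and bin count for all windows; the function names and sample values are hypothetical, and this is not taken from any answer to the question.

```python
from typing import List, Sequence, Tuple


def window_histogram(values: Sequence[float], lo: float, hi: float, n_bins: int) -> List[int]:
    """Equal-width counts over [lo, hi]; values outside the range are skipped."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


def concatenate_histograms(
    windows: Sequence[Sequence[float]], lo: float, hi: float, n_bins: int
) -> Tuple[List[int], List[int]]:
    """Return bar positions and heights with each window's bins placed after the previous window's."""
    positions: List[int] = []
    heights: List[int] = []
    for w, values in enumerate(windows):
        for b, count in enumerate(window_histogram(values, lo, hi, n_bins)):
            positions.append(w * n_bins + b)  # offset by the window index
            heights.append(count)
    return positions, heights


# Hypothetical data: two time windows, two bins each over [0, 1].
windows = [[0.1, 0.2, 0.9], [0.6, 0.7]]
positions, heights = concatenate_histograms(windows, 0.0, 1.0, 2)
print(positions)
print(heights)
```

The two lists can then be handed to a bar-chart call, for example matplotlib's `bar(positions, heights)`, to draw all windows on one axis.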

How do I save histogram to file in matlab?

不打扰是莪最后的温柔 submitted on 2020-01-06 13:02:01
Question:
    figure;
    histogram = hist(np,180);
    name=['histogram-' int2str(k) '.png']; %% k is the iterator, so basically I want to save all the images using a loop.
    imwrite(out,name);
The image I got is only a horizontal line. Does someone know how to fix this?
Answer 1: You can use savefig instead of imwrite. Here is the doc: http://www.mathworks.ch/ch/help/matlab/ref/savefig.html
    savefig(h,filename)
h is the handle of the figure. You can omit h to save the current figure. (edit) savefig may not be there depending