Question
I'm having trouble manipulating a histogram. I have a df with two columns that I plot as a stacked histogram. I put the values into specific bins (see code below), but I want one large bin at the end (4000-10000). By default, the column for that large bin is drawn much wider than the others. Is there a way to draw all of the columns at the same width, even though their x-ranges are uneven?
Code:
df.plot.hist(stacked=True, bins=[0,400,800,1200,1600,2000,2400,2800,3200,3600,4000,10000], density=True)
Thank you!!!
EDIT:
Per advice, here is an example dataset. It is crude, but it may help illustrate the problem.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,4000,size=(100, 2)), columns=['A','B'])
df.loc[85:89, 'A'] = np.random.randint(5000,10000, size=5)
df.plot.hist(stacked=True, bins=[0,400,800,1200,1600,2000,2400,2800,3200,3600,4000,10000], density=True)
Answer 1:
Make all bins the same size, then clip your data to the right end of the last bin.
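import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,4000,size=(100, 2)), columns=['A','B'])
df.loc[85:89, 'A'] = np.random.randint(5000,10000, size=5)
# Equal-width bins; the last one (4000-4400) stands in for 4000-10000
bins = [0,400,800,1200,1600,2000,2400,2800,3200,3600,4000,4400]
# Clip values above 4400 so they all land in the last bin
df.clip(upper=4400).plot.hist(stacked=True, bins=bins, density=True)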
Keep in mind that, as pointed out in the comments, this is not really a histogram, because the last bar covers a wider range of values than its drawn width suggests. You might want to customize the labels to reflect the fact that the last bin is actually larger than it looks.
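One way to customize the labels, as a rough sketch rather than the only option: put a tick on every bin edge and relabel the last edge as 10000, so the axis shows the range the final bar really covers. The seeded generator, the `labels` list, and the axis label text are my own illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Example data in the spirit of the question (seeded for repeatability)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 4000, size=(100, 2)), columns=["A", "B"])
df.loc[85:89, "A"] = rng.integers(5000, 10000, size=5)

# Equal-width bins; the last one (4000-4400) stands in for 4000-10000
bins = list(range(0, 4401, 400))
ax = df.clip(upper=4400).plot.hist(stacked=True, bins=bins, density=True)

# Tick on every bin edge, with the last edge relabeled to the true upper bound
labels = [str(edge) for edge in bins]
labels[-1] = "10000"
ax.set_xticks(bins)
ax.set_xticklabels(labels, rotation=45)
ax.set_xlabel("value (last bin covers 4000-10000)")
plt.tight_layout()
# Call plt.show() or ax.get_figure().savefig(...) to view the result
```

Note that the bar heights are still normalized as if the last bin were 400 units wide, so the plot should be read as a bar chart of bin shares rather than a true density.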
Source: https://stackoverflow.com/questions/41080028/how-to-make-the-width-of-histogram-columns-all-the-same