pandas - histogram from two columns?

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-20 03:23:05

Question


I have this data:

data = pd.DataFrame().from_dict([r for r in response])
print(data)

     _id  total
0    213      1
1    194      3
2    205      156
...

Now, if I call:

data.hist()

I will get two separate histograms, one for each column. This is not what I want. What I want is a single histogram made using those two columns, where one column is interpreted as a value and another one as a number of occurrences of this value. What should I do to generate such a histogram?

I tried:

data.hist(column="_id", by="total")

But this generates even more (empty) histograms, along with an error message.


Answer 1:


You can always drop down to the lower-level matplotlib.pyplot.hist and pass the counts as weights:

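import numpy as np
import pandas as pd
from matplotlib.pyplot import hist

df = pd.DataFrame({
    '_id': np.random.randn(100),          # values to bin
    'total': 100 * np.random.rand(100)    # one weight per value
})
# weights makes each value count `total` times instead of once
hist(df._id, weights=df.total)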
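
Applied to the three rows shown in the question, the same idea might look like the sketch below. The bin count of 5 and the axis labels are assumptions chosen for illustration, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# The three rows shown in the question; real data would come from `response`.
data = pd.DataFrame({"_id": [213, 194, 205], "total": [1, 3, 156]})

fig, ax = plt.subplots()
# Each _id adds `total` to its bin instead of adding 1.
counts, edges, _ = ax.hist(data["_id"], bins=5, weights=data["total"])
ax.set_xlabel("_id")
ax.set_ylabel("occurrences")
```

Call plt.show() (with an interactive backend) or fig.savefig(...) to see the result. The bar heights sum to the sum of the total column, which is a quick way to check that the weights were picked up.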




Answer 2:


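Since you already have the bin frequencies computed (the total column), just use pandas.DataFrame.plot and draw them as bars. Note that kind='hist' would bin the values of the total column again, so kind='bar' is the one that plots each _id against its count:

data.plot(x='_id', y='total', kind='bar')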
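
As a minimal sketch of drawing the precomputed counts directly, a bar chart with one bar per _id could be built as follows. It reuses the three rows from the question; sorting by _id and hiding the legend are assumptions made for readability, not something the answer specifies.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import pandas as pd

# The three rows shown in the question.
data = pd.DataFrame({"_id": [213, 194, 205], "total": [1, 3, 156]})

# One bar per _id, with the bar height taken from the total column.
ax = data.sort_values("_id").plot(x="_id", y="total", kind="bar", legend=False)
ax.set_ylabel("occurrences")

# Bar heights in left-to-right order, handy for a quick sanity check.
heights = [float(p.get_height()) for p in ax.patches]
```

Unlike a weighted histogram, this treats _id as a category: the bars are evenly spaced and nothing is grouped into bins.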


Source: https://stackoverflow.com/questions/31571830/pandas-histogram-from-two-columns
