Pandas bar plot with binned range

佛祖请我去吃肉 2020-12-08 03:24

Is there a way to create a bar plot from continuous data binned into predefined intervals? For example,

In[1]: df
Out[1]: 
0      0.729630
1      0.699620
2         


        
3 Answers
  • 2020-12-08 03:50

    You may consider using matplotlib to plot the histogram. Unlike pandas' hist function, matplotlib.pyplot.hist accepts an array as input for the bins.

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    
    np.random.seed(0)  # reproducible sample data
    x = np.random.rand(120)
    df = pd.DataFrame({"x": x})
    
    # Explicit bin edges instead of a bin count
    bins = [0, 0.35, 0.7, 1]
    plt.hist(df["x"], bins=bins, edgecolor="k")
    plt.xticks(bins)  # put the ticks on the bin edges
    
    plt.show()
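
    To check the bar heights without looking at the figure, the same bin edges can be passed to numpy.histogram, which is what plt.hist uses for the binning. This is only a sketch that reuses the sample data from above; the names counts and edges are mine.

```python
import numpy as np

np.random.seed(0)
x = np.random.rand(120)  # same sample data as in the answer above

bins = [0, 0.35, 0.7, 1]
# counts[i] is the number of values between edges[i] and edges[i + 1]
counts, edges = np.histogram(x, bins=bins)

# Every bin is half-open [left, right) except the last one, which also includes its right edge
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left} to {right}: {n}")
```

    Since np.random.rand only returns values in [0, 1), the three counts should add up to 120.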
    

  • 2020-12-08 03:59

    You can use pd.cut to assign each value to a bin, then group by the bins and count the rows in each one.

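    # Assumes df has a numeric column named 'val'
    bins = [0, 0.35, 0.7, 1]
    # observed=False keeps empty bins in the result, so they still get a (zero-height) bar
    counts = df.groupby(pd.cut(df['val'], bins=bins), observed=False)['val'].count()
    counts.plot(kind='bar')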
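
    For a self-contained look at what the groupby returns before plotting, here is a small sketch. The sample data is made up; only the column name 'val' and the bin edges come from the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 120 uniform values in [0, 1)
rng = np.random.default_rng(0)
df = pd.DataFrame({"val": rng.random(120)})

bins = [0, 0.35, 0.7, 1]
# pd.cut labels each value with a right-closed interval: (0, 0.35], (0.35, 0.7], (0.7, 1]
binned = pd.cut(df["val"], bins=bins)
counts = df.groupby(binned, observed=False)["val"].count()

print(counts)  # a Series indexed by the three intervals
```

    Note that with the default settings a value of exactly 0 falls outside the first interval and becomes NaN, so it would not be counted.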
    

  • 2020-12-08 04:00

    You can use pd.cut to partition the values into bins corresponding to each interval, then get the count for each interval with value_counts. Plot the result as a bar graph and replace the x-axis tick labels with the name of the interval each tick belongs to.

    out = pd.cut(s, bins=[0, 0.35, 0.7, 1], include_lowest=True)
    ax = out.value_counts(sort=False).plot.bar(rot=0, color="b", figsize=(6,4))
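    # Categories are Interval objects in current pandas, so convert to str before slicing off the brackets
    ax.set_xticklabels([str(c)[1:-1].replace(",", " to") for c in out.cat.categories])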
    plt.show()
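
    The effect of include_lowest=True and of the label formatting is easier to see on a handful of values. This is a sketch with made-up data, assuming a recent pandas where the categories are Interval objects; the variable names are mine.

```python
import pandas as pd

# Hypothetical values, including both outer bin edges 0.0 and 1.0
s = pd.Series([0.0, 0.2, 0.35, 0.5, 0.9, 1.0])
bins = [0, 0.35, 0.7, 1]

default = pd.cut(s, bins=bins)                     # first interval is (0, 0.35], so 0.0 becomes NaN
out = pd.cut(s, bins=bins, include_lowest=True)    # first interval is widened slightly to include 0.0

print(default.isna().sum(), out.isna().sum())

# Turn each interval into a tick label such as "0.35 to 0.7"
labels = [str(c)[1:-1].replace(",", " to") for c in out.cat.categories]
print(labels)
```

    Because the first edge is nudged down to include 0, the first label will usually start with a small negative number rather than 0.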
    


    If you want the y-axis to show relative percentages, normalize the frequency counts and multiply the result by 100.

    out = pd.cut(s, bins=[0, 0.35, 0.7, 1], include_lowest=True)
    out_norm = out.value_counts(sort=False, normalize=True).mul(100)
    ax = out_norm.plot.bar(rot=0, color="b", figsize=(6,4))
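    # Categories are Interval objects in current pandas, so convert to str before slicing off the brackets
    ax.set_xticklabels([str(c)[1:-1].replace(",", " to") for c in out.cat.categories])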
    plt.ylabel("pct")
    plt.show()
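
    As a quick sanity check on the normalization, here is a sketch with five made-up values where the percentages are easy to verify by hand.

```python
import pandas as pd

# Hypothetical values: three in the first bin, one in each of the others
s = pd.Series([0.1, 0.2, 0.3, 0.5, 0.8])

out = pd.cut(s, bins=[0, 0.35, 0.7, 1], include_lowest=True)
# normalize=True returns fractions of the total; multiplying by 100 gives percentages
pct = out.value_counts(sort=False, normalize=True).mul(100)

print(pct.round(1).tolist())  # [60.0, 20.0, 20.0]
```

    The percentages add up to 100 as long as every value falls into one of the bins.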
    
