Question
I have a dataset of traffic violations and want to display only the top 10 violations per month on a bar graph. Can I limit the number of bars after sorting the values, so that only the top 10 are displayed? There are 42 different columns of traffic violations.
month_jan = df[df.MonthName == "Jan"]
month_jan[feature_cols].sum().sort_values(ascending=0).plot(kind='bar')
feature_cols is a list of all 42 column names that correspond to traffic violations.
Thanks!
Answer 1:
This will work:
month_jan[feature_cols].sum().sort_values(ascending=False)[:10].plot(kind='bar')
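If you'd rather not sort and slice in two steps, Series.nlargest should give the same top-N selection in one call. A minimal sketch, assuming made-up violation names and counts (none of this data comes from the question) and a cutoff of 3 so the effect is visible on a small Series:

```python
import pandas as pd

# Hypothetical column totals; names and numbers are invented for illustration.
totals = pd.Series({
    "speeding": 40,
    "parking": 25,
    "red_light": 12,
    "no_seatbelt": 7,
    "expired_tag": 3,
})

# nlargest returns the N largest values, already in descending order.
top3 = totals.nlargest(3)
print(top3)

# top3.plot(kind='bar') would then draw only these three bars.
```

With the question's data, that would be month_jan[feature_cols].sum().nlargest(10).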
Answer 2:
Series objects have a .head method, just like DataFrames (docs). This allows you to select the top N items very elegantly with data.head(N).
Here's a complete working example:
import pandas as pd
df = pd.DataFrame({
'feature1': [0, 1, 2, 3],
'feature2': [2, 3, 4, 5],
'MonthName': ['Jan', 'Jan', 'Jan', 'Feb']
})
feature_cols = ['feature1', 'feature2']
month_jan = df[df.MonthName == "Jan"]
top10 = month_jan[feature_cols].sum().sort_values(ascending=False).head(10)
top10.plot(kind='bar')
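Since the question asks for the top 10 per month, the same idea can probably be extended to all months with a groupby. This is only a sketch reusing the toy data from the example; top_n_per_month is a hypothetical helper name, not a pandas function, and n=1 is used because the toy data has just two feature columns:

```python
import pandas as pd

df = pd.DataFrame({
    'feature1': [0, 1, 2, 3],
    'feature2': [2, 3, 4, 5],
    'MonthName': ['Jan', 'Jan', 'Jan', 'Feb'],
})
feature_cols = ['feature1', 'feature2']


def top_n_per_month(frame, cols, n=10):
    """Hypothetical helper: map each month to its n largest column totals."""
    # One row per month, one column per violation type.
    monthly_totals = frame.groupby('MonthName')[cols].sum()
    return {
        month: row.sort_values(ascending=False).head(n)
        for month, row in monthly_totals.iterrows()
    }


tops = top_n_per_month(df, feature_cols, n=1)
for month, series in tops.items():
    print(month, series.to_dict())
    # series.plot(kind='bar', title=month) would draw one chart per month.
```

Each value in the returned dict is an ordinary Series, so it can be plotted the same way as top10 above.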
Source: https://stackoverflow.com/questions/38338396/sort-and-limit-number-of-bars-to-display-on-bargraph