How to apply custom column order (on Categorical) to pandas boxplot?

Posted by 谁都会走 on 2019-12-01 03:21:41

Hard to say how to do this without a working example. My first guess would be to just add an integer column with the orders that you want.

A simple, brute-force way would be to add each boxplot one at a time.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(37,4), columns=list('ABCD'))
columns_my_order = ['C', 'A', 'D', 'B']
fig, ax = plt.subplots()
for position, column in enumerate(columns_my_order):
    ax.boxplot(df[column], positions=[position])

ax.set_xticks(range(position+1))
ax.set_xticklabels(columns_my_order)
ax.set_xlim(xmin=-0.5)
plt.show()
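
If the data is already in wide form, as in the example above, a shorter route may be to pass the columns in the desired order and let pandas draw them in one call. This is a minimal sketch, assuming a pandas version whose `DataFrame.boxplot` plots the columns in the order given by `column`; the helper name `boxplot_in_order` is made up for illustration:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def boxplot_in_order(df, columns_my_order):
    """Draw one box per column, left to right in the given order."""
    fig, ax = plt.subplots()
    # Passing a list of column names selects and reorders the columns
    df.boxplot(column=columns_my_order, ax=ax)
    return ax


df = pd.DataFrame(np.random.rand(37, 4), columns=list('ABCD'))
ax = boxplot_in_order(df, ['C', 'A', 'D', 'B'])
```

Call `plt.show()` afterwards to display the figure.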

I got stuck on the same question, and solved it by mapping each category to a key that sorts in the desired order and then resetting the xticklabels:

df = pd.DataFrame({"A":["d","c","d","c",'d','c','a','c','a','c','a','c']})
df['val']=(np.random.rand(12))
df['B']=df['A'].replace({'d':'0','c':'1','a':'2'})
ax=df.boxplot(column='val',by='B')
ax.set_xticklabels(list('dca'))

Note that pandas can now create categorical columns. If you don't mind having all the columns present in your graph, or trimming them appropriately, you can do something like the below:

http://pandas.pydata.org/pandas-docs/stable/categorical.html

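# astype('category', ordered=True) no longer works in newer pandas; pass a CategoricalDtype instead
df['Category'] = df['Category'].astype(pd.CategoricalDtype(ordered=True))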
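
To make the custom order concrete for long-form data, here is a small sketch that builds an ordered categorical from an explicit list and groups the boxplot by it. It assumes a pandas version that provides `pd.CategoricalDtype` and in which `boxplot(by=...)` follows the category order; the column names `key` and `val` and the list `my_order` are placeholders, not part of the original answer:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'key': list('dcdcdcacacac')})
df['val'] = np.random.rand(len(df))

my_order = ['d', 'c', 'a']  # desired left-to-right order (placeholder)
order_dtype = pd.CategoricalDtype(categories=my_order, ordered=True)
df['key'] = df['key'].astype(order_dtype)

# Grouping by a categorical column should follow its category order
ax = df.boxplot(column='val', by='key')
```

Unlike the mapping approach, this leaves the tick labels untouched, so there is no need to reset them by hand.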

Recent pandas also appears to allow positions to pass all the way through from frame to axes.

Cireo

EDIT: this is the right answer now that direct support has been added, somewhere between versions 0.15 and 0.18.


Adding a separate answer, which perhaps could be another question - feedback appreciated.

I wanted to add a custom column order within a groupby, which posed many problems for me. In the end, I had to avoid trying to use boxplot from a groupby object, and instead go through each subplot myself to provide explicit positions.

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame()
df['GroupBy'] = ['g1', 'g2', 'g3', 'g4'] * 6
df['PlotBy'] = [chr(ord('A') + i) for i in range(24)]
df['SortBy'] = list(reversed(range(24)))
df['Data'] = [i * 10 for i in range(24)]

# Note that this has no effect on the boxplot
df = df.sort_values(['GroupBy', 'SortBy'])
for group, info in df.groupby('GroupBy'):
    print('Group: %r\n%s\n' % (group, info))

# With the below, cannot use
#  - sort data beforehand (not preserved, can't access in groupby)
#  - categorical (not all present in every chart)
#  - positional (different lengths and sort orders per group)
# df.groupby('GroupBy').boxplot(layout=(1, 5), column=['Data'], by=['PlotBy'])

fig, axes = plt.subplots(1, df.GroupBy.nunique(), sharey=True)
for ax, (g, d) in zip(axes, df.groupby('GroupBy')):
    d.boxplot(column=['Data'], by=['PlotBy'], ax=ax, positions=d.index.values)
plt.show()

Within my final code, it was even slightly more involved to determine positions because I had multiple data points for each sortby value, and I ended up having to do the below:

to_plot = data.sort_values([sort_col]).groupby(group_col)
for ax, (group, group_data) in zip(axes, to_plot):
    # Use existing sorting
    ordering = enumerate(group_data[sort_col].unique())
    positions = [ind for val, ind in sorted((v, i) for (i, v) in ordering)]
    ax = group_data.boxplot(column=[col], by=[plot_by], ax=ax, positions=positions)

Fernanda

It might sound kind of silly, but many plotting functions let you specify the order directly. For example:

Library & dataset

import seaborn as sns
df = sns.load_dataset('iris')

Specific order

import matplotlib.pyplot as plt  # sns.plt is not available in newer seaborn releases

p1 = sns.boxplot(x='species', y='sepal_length', data=df, order=["virginica", "versicolor", "setosa"])
plt.show()