Python boxplot out of columns of different lengths

ぃ、小莉子 提交于 2019-12-22 05:09:09

问题


I have the following dataframe in Python (the actual dataframe is much bigger, just presenting a small sample):

      A     B     C     D     E     F
0  0.43  0.52  0.96  1.17  1.17  2.85
1  0.43  0.52  1.17  2.72  2.75  2.94
2  0.43  0.53  1.48  2.85  2.83  
3  0.47  0.59  1.58        3.14  
4  0.49  0.80        

I convert the dataframe to numpy using df.values and then pass that to boxplot.

When I try to make a boxplot out of this pandas dataframe, the number of values picked from each column is restricted to the least number of values in a column (in this case, column F). Is there any way I can boxplot all values from each column?

NOTE: I use df.dropna to drop the rows in each column with missing values. However, this is resizing the dataframe to the lowest common denominator of column length, and messing up the plotting.

import prettyplotlib as ppl
import numpy as np
import pandas
import matplotlib as mpl
from matplotlib import pyplot

df = pandas.DataFrame.from_csv(csv_data,index_col=False)
df = df.dropna()
labels = ['A', 'B', 'C', 'D', 'E', 'F']
fig, ax = pyplot.subplots()
ppl.boxplot(ax, df.values, xticklabels=labels)
pyplot.show()

回答1:


The right way to do it, saving from reinventing the wheel, would be to use the .boxplot() in pandas, where the nan handled correctly:

In [31]:

print df
      A     B     C     D     E     F
0  0.43  0.52  0.96  1.17  1.17  2.85
1  0.43  0.52  1.17  2.72  2.75  2.94
2  0.43  0.53  1.48  2.85  2.83   NaN
3  0.47  0.59  1.58   NaN  3.14   NaN
4  0.49  0.80   NaN   NaN   NaN   NaN

[5 rows x 6 columns]
In [32]:

_=plt.boxplot(df.values)
_=plt.xticks(range(1,7),labels)
plt.savefig('1.png') #keeping the nan's and plot by plt

In [33]:

_=df.boxplot()
plt.savefig('2.png') #keeping the nan's and plot by pandas

In [34]:

_=plt.boxplot(df.dropna().values)
_=plt.xticks(range(1,7),labels)
plt.savefig('3.png') #dropping the nan's and plot by plt



来源:https://stackoverflow.com/questions/23144071/python-boxplot-out-of-columns-of-different-lengths

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!