Matplotlib.pyplot.hist() very slow

我只是一个虾纸丫 提交于 2020-05-12 12:21:46

问题


I'm plotting about 10,000 items in an array. They are of around 1,000 unique values.

The plotting has been running half an hour now. I made sure rest of the code works.

Is it that slow? This is my first time plotting histograms with pyplot.


回答1:


To plot histograms using matplotlib quickly you need to pass the histtype='step' argument to pyplot.hist. For example:

plt.hist(np.random.exponential(size=1000000,bins=10000))
plt.show()

takes ~15 seconds to draw and roughly 5-10 seconds to update when you pan or zoom.

In contrast, plotting with histtype='step':

plt.hist(np.random.exponential(size=1000000),bins=10000,histtype='step')
plt.show()

plots almost immediately and can be panned and zoomed with no delay.




回答2:


Importing seaborn somewhere in the code may cause pyplot.hist to take a really long time.

If the problem is seaborn, it can be solved by resetting the matplotlib settings:

import seaborn as sns
sns.reset_orig()



回答3:


It will be instant to plot the histogram after flattening the numpy array. Try the below demo code:

import numpy as np

array2d = np.random.random_sample((512,512))*100
plt.hist(array2d.flatten())
plt.hist(array2d.flatten(), bins=1000)



回答4:


For me, the problem is that the data type of pd.series, say S, is 'object' rather than 'float64'. After I use S = np.float64(S), then plt.hist(S) is very quick!!




回答5:


For me it took calling figure.canvas.draw() after the call to hist to update immediately, i.e. hist was actually fast (discovered that after timing it), but there was a delay of a few seconds before figure was updated. I was calling hist inside a matplotlib callback in a jupyter lab cell (qt5 backend).




回答6:


Anyone running into the issue I had - (which is totally my bad :) )

If you're dealing with numbers, make sure when reading from CSV that your datatype is int/float, and not string.

values_arr = .... .flatten().astype('float')



回答7:


If you are working with pandas, make sure the data you passed in plt.hist() is a 1-d series rather than a dataframe. This helped me out.



来源:https://stackoverflow.com/questions/35738199/matplotlib-pyplot-hist-very-slow

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!