Question
I would like to create a histogram combined with a density plot in Bokeh, with a slider filter. At the moment, I have the building blocks to create a Bokeh histogram with a density plot, taken from another thread. I don't know how to write the callback function that updates the data and re-renders the plot.
from bokeh.io import output_file, show
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg as df
from numpy import histogram, linspace
from scipy.stats import gaussian_kde  # scipy.stats.kde is a deprecated import path
# fit a kernel density estimate to the horsepower column
pdf = gaussian_kde(df.hp)
x = linspace(0, 250, 50)
p = figure(plot_height=300)  # on Bokeh 3.x, use height=300 instead
p.line(x, pdf(x))
# plot actual hist for comparison
hist, edges = histogram(df.hp, density=True, bins=20)
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], alpha=0.4)
show(p)
Answer 1:
There are two ways to implement callbacks in Bokeh:
- With JS code. In that case the plot remains a standalone object, the constraint being that you need to do any data manipulation in JavaScript (there is a small caveat to that statement, but it is not relevant here: `scipy` can't be called from such a callback).
- By having the callback executed in a Bokeh server, in which case you have the full arsenal of Python available to you. The cost is that there's a bit more to plotting and distributing the graph than in the first case (but it's not difficult, see the example).
Considering you need to refit the KDE each time you change the filter condition, the second way is the only option (unless you want to do that in JavaScript...).
Here is how you would do it (example with a filter on `cyl`):
from bokeh.application import Application
from bokeh.application.handlers import FunctionHandler
from bokeh.io import output_notebook, show
from bokeh.layouts import column
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, Select
from bokeh.sampledata.autompg import autompg as df
from numpy import histogram, linspace
from scipy.stats import gaussian_kde  # scipy.stats.kde is a deprecated import path

output_notebook()

def modify_doc(doc):
    x = linspace(0, 250, 50)

    # data sources start empty; update() fills them
    source_hist = ColumnDataSource({'top': [], 'left': [], 'right': []})
    source_kde = ColumnDataSource({'x': [], 'y': []})

    p = figure(plot_height=300)  # on Bokeh 3.x, use height=300 instead
    p.line(x='x', y='y', source=source_kde)
    p.quad(top='top', bottom=0, left='left', right='right', alpha=0.4, source=source_hist)

    def update(attr, old, new):
        # `new` is the str value of the Select
        if new == 'All':
            filtered_df = df
        else:
            condition = df.cyl == int(new)
            filtered_df = df[condition]

        # recompute the histogram and refit the KDE on the filtered data
        hist, edges = histogram(filtered_df.hp, density=True, bins=20)
        pdf = gaussian_kde(filtered_df.hp)

        source_hist.data = {'top': hist, 'left': edges[:-1], 'right': edges[1:]}
        source_kde.data = {'x': x, 'y': pdf(x)}

    # initial render with no filter
    update(None, None, 'All')

    select = Select(title='# cyl', value='All', options=['All'] + [str(i) for i in df.cyl.unique()])
    select.on_change('value', update)

    doc.add_root(column(select, p))

# To run it in the notebook:
plot = Application(FunctionHandler(modify_doc))
show(plot)

# Or to run it stand-alone with `bokeh serve --show myapp.py`
# in which case you need to remove the `output_notebook()` call
# from bokeh.io import curdoc
# modify_doc(curdoc())
A few notes:
- This is made to be run in a Jupyter notebook (see the `output_notebook()` call and the last two uncommented lines).
- To run it outside, comment out the notebook lines (see above) and uncomment the last two lines. Then you can run it from the command line.
- `Select` will only handle `str` values, so you need to convert in (when creating it) and out (when using the values: `old` and `new`).
- For multiple filters, you need to access the state of each `Select` at the same time. You do that by instantiating the `Select`s before defining the `update` function (but without any callbacks, yet!) and keeping a reference to them, accessing their value with `your_ref.value` and building your condition with that. After the `update` definition, you can then attach the callback on each `Select`.
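The `str` conversion and the combined condition from the notes above can be sketched without Bokeh or pandas. This is only an illustration: `to_options`, `parse_choice` and `row_matches` are hypothetical helper names, the rows are made-up values rather than the real dataset, and sorting the options is my own choice (the example above keeps the order returned by `unique()`).

```python
ALL = 'All'  # label meaning "no filter", same label as in the example above


def to_options(values):
    """Convert raw column values into the str options a Select expects."""
    return [ALL] + [str(v) for v in sorted(set(values))]


def parse_choice(choice, cast=int):
    """Convert a Select value back to the column type; None means no filter."""
    return None if choice == ALL else cast(choice)


def row_matches(row, wanted):
    """True if the row passes every active filter (None entries are ignored)."""
    return all(value is None or row[col] == value for col, value in wanted.items())


# Toy rows standing in for the DataFrame (made-up values, not the real data)
rows = [
    {'cyl': 4, 'origin': 1, 'hp': 90},
    {'cyl': 4, 'origin': 3, 'hp': 70},
    {'cyl': 8, 'origin': 1, 'hp': 150},
]

options = to_options(r['cyl'] for r in rows)
wanted = {'cyl': parse_choice('4'), 'origin': parse_choice('All')}
kept = [r['hp'] for r in rows if row_matches(r, wanted)]
print(options, kept)  # ['All', '4', '8'] [90, 70]
```

In the Bokeh app, the strings passed to `parse_choice` would come from each widget's `.value`, read inside `update` at the time the callback fires.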
Finally, an example with multiple selects:
import pandas as pd  # needed in addition to the imports of the previous example

def modify_doc(doc):
    x = linspace(0, 250, 50)

    source_hist = ColumnDataSource({'top': [], 'left': [], 'right': []})
    source_kde = ColumnDataSource({'x': [], 'y': []})

    p = figure(plot_height=300)  # on Bokeh 3.x, use height=300 instead
    p.line(x='x', y='y', source=source_kde)
    p.quad(top='top', bottom=0, left='left', right='right', alpha=0.4, source=source_hist)

    # create the widgets first, without callbacks, so update() can read both
    select_cyl = Select(title='# cyl', value='All', options=['All'] + [str(i) for i in df.cyl.unique()])
    select_ori = Select(title='origin', value='All', options=['All'] + [str(i) for i in df.origin.unique()])

    def update(attr, old, new):
        # boolean mask that keeps every row (renamed to avoid shadowing the builtin `all`)
        all_rows = pd.Series(True, index=df.index)
        if select_cyl.value == 'All':
            cond_cyl = all_rows
        else:
            cond_cyl = df.cyl == int(select_cyl.value)
        if select_ori.value == 'All':
            cond_ori = all_rows
        else:
            cond_ori = df.origin == int(select_ori.value)
        filtered_df = df[cond_cyl & cond_ori]

        hist, edges = histogram(filtered_df.hp, density=True, bins=20)
        pdf = gaussian_kde(filtered_df.hp)

        source_hist.data = {'top': hist, 'left': edges[:-1], 'right': edges[1:]}
        source_kde.data = {'x': x, 'y': pdf(x)}

    update(None, None, 'All')

    # attach the same callback to both widgets
    select_ori.on_change('value', update)
    select_cyl.on_change('value', update)

    doc.add_root(column(select_ori, select_cyl, p))
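One caveat with combined filters: some combinations may select very few rows or none at all (for instance a cylinder count that no car of the chosen origin has). As far as I know, `gaussian_kde` raises an error when it is given fewer than two points or when all points are identical, so the callback would fail for those selections. A minimal sketch of a guard follows; `enough_for_kde` is a hypothetical helper, not part of SciPy or Bokeh.

```python
def enough_for_kde(values):
    """Return True if `values` contains at least two distinct entries.

    Assumption: this mirrors the cases where a Gaussian KDE cannot be
    fitted (too few points, or zero variance). It is not a SciPy function.
    """
    distinct = set()
    for v in values:
        distinct.add(v)
        if len(distinct) >= 2:
            return True
    return False


# Hypothetical use inside update(), before fitting:
#     if enough_for_kde(filtered_df.hp):
#         source_kde.data = {'x': x, 'y': gaussian_kde(filtered_df.hp)(x)}
#     else:
#         source_kde.data = {'x': [], 'y': []}   # draw no density line

print(enough_for_kde([]))             # False
print(enough_for_kde([95.0, 95.0]))   # False
print(enough_for_kde([95.0, 110.0]))  # True
```

The histogram source could be reset to empty lists in the same `else` branch, so that the plot simply shows nothing for a selection without data.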
Source: https://stackoverflow.com/questions/46644197/histogram-with-slider-filter