How do I make bokeh omit missing dates when using datetime as x-axis

Submitted by 和自甴很熟 on 2019-11-27 02:58:17

Question


I am looking at the candlestick example in the bokeh docs, found here:

https://github.com/bokeh/bokeh/blob/master/examples/plotting/file/candlestick.py

and I am trying to figure out a good way to eliminate the "spaces" in the x-axis where there is no data.

Specifically, for financial data like MSFT used in the example, there is no data for weekends and holidays. Is there a way to tell bokeh not to leave an empty space in the chart when there is no data for a date?

Here is a paste of the example code found at the above link for convenience:

from math import pi
import pandas as pd

from bokeh.sampledata.stocks import MSFT
from bokeh.plotting import *

df = pd.DataFrame(MSFT)[:50]
df['date'] = pd.to_datetime(df['date'])

mids = (df.open + df.close)/2
spans = abs(df.close-df.open)

inc = df.close > df.open
dec = df.open > df.close
w = 12*60*60*1000 # half day in ms

output_file("candlestick.html", title="candlestick.py example")

figure(x_axis_type = "datetime", tools="pan,wheel_zoom,box_zoom,reset,previewsave",
   width=1000, name="candlestick")

hold()

segment(df.date, df.high, df.date, df.low, color='black')
rect(df.date[inc], mids[inc], w, spans[inc], fill_color="#D5E1DD", line_color="black")
rect(df.date[dec], mids[dec], w, spans[dec], fill_color="#F2583E", line_color="black")

curplot().title = "MSFT Candlestick"
xaxis().major_label_orientation = pi/4
grid().grid_line_alpha=0.3

show()  # open a browser

Answer 1:


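UPDATE: As of Bokeh 0.12.6, you can specify overrides for major tick labels on axes. The idea is to plot against the integer dataframe indices, which have no gaps, and relabel the ticks with the corresponding dates.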

import pandas as pd

from bokeh.io import show, output_file
from bokeh.plotting import figure
from bokeh.sampledata.stocks import MSFT

df = pd.DataFrame(MSFT)[:50]
inc = df.close > df.open
dec = df.open > df.close

p = figure(plot_width=1000, title="MSFT Candlestick with Custom X-Axis")

# map dataframe indices to date strings and use as label overrides
p.xaxis.major_label_overrides = {
    i: date.strftime('%b %d') for i, date in enumerate(pd.to_datetime(df["date"]))
}

# use the *indices* for x-axis coordinates, overrides will print better labels
p.segment(df.index, df.high, df.index, df.low, color="black")
p.vbar(df.index[inc], 0.5, df.open[inc], df.close[inc], fill_color="#D5E1DD", line_color="black")
p.vbar(df.index[dec], 0.5, df.open[dec], df.close[dec], fill_color="#F2583E", line_color="black")

output_file("custom_datetime_axis.html", title="custom_datetime_axis.py example")

show(p)

If you have a very large number of dates, this approach might become unwieldy, and a custom extension might become necessary.
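To see why plotting against indices removes the gaps, here is a minimal sketch that needs only the standard library. The helper name `index_label_overrides` and the sample dates are made up for illustration and are not part of Bokeh; the function builds the same kind of dictionary that is assigned to `major_label_overrides` above.

```python
from datetime import date


def index_label_overrides(dates, fmt="%b %d"):
    """Map consecutive integer positions to formatted date labels.

    Hypothetical helper for illustration: the returned dict has the same
    shape as the one passed to p.xaxis.major_label_overrides above.
    """
    return {i: d.strftime(fmt) for i, d in enumerate(dates)}


# A Friday followed by the next Monday and Tuesday: the weekend is
# simply absent from the data.
trading_days = [date(2000, 3, 3), date(2000, 3, 6), date(2000, 3, 7)]

overrides = index_label_overrides(trading_days)

# The x coordinates are 0, 1, 2 with no hole for the weekend; the dates
# survive only as tick labels.
print(overrides)
```

Because the glyphs are positioned at 0, 1, 2, ... rather than at timestamps, Friday and Monday end up adjacent on the axis.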




Answer 2:


UPDATE 2016-05-26:

Some details of the BokehJS interface have changed. For Bokeh 0.11 and newer, the __implementation__ should now be:

__implementation__ = """
    _ = require "underscore"
    Model = require "model"
    p = require "core/properties"

    class DateGapTickFormatter extends Model
      type: 'DateGapTickFormatter'

      doFormat: (ticks) ->
        date_labels = @get("date_labels")
        return (date_labels[tick] ? "" for tick in ticks)

      @define {
        date_labels: [ p.Any ]
      }

    module.exports =
      Model: DateGapTickFormatter
"""

This is not expected to change any further.

2016-02-09

Pull request 3314 was made for an example that works as of 2015-12-05. The original code is here. The documentation for the candlestick example still shows the same code that the OP posted in the question.

The working example is included below for reference.

from math import pi

import pandas as pd

from bokeh.sampledata.stocks import MSFT
from bokeh.plotting import figure, show, output_file
from bokeh.models.formatters import TickFormatter, String, List

# In this custom TickFormatter, xaxis labels are taken from an array of date
# strings (e.g. ['Sep 01', 'Sep 02', ...]) passed to the date_labels property.
class DateGapTickFormatter(TickFormatter):
    date_labels = List(String)

    __implementation__ = """
_ = require "underscore"
HasProperties = require "common/has_properties"

class DateGapTickFormatter extends HasProperties
  type: 'DateGapTickFormatter'

  format: (ticks) ->
    date_labels = @get("date_labels")
    return (date_labels[tick] ? "" for tick in ticks)

module.exports =
  Model: DateGapTickFormatter
"""

df = pd.DataFrame(MSFT)[:50]

# xaxis date labels used in the custom TickFormatter
date_labels = [date.strftime('%b %d') for date in pd.to_datetime(df["date"])]

mids = (df.open + df.close)/2
spans = abs(df.close-df.open)

inc = df.close > df.open
dec = df.open > df.close
w = 0.5

output_file("custom_datetime_axis.html", title="custom_datetime_axis.py example")

TOOLS = "pan,wheel_zoom,box_zoom,reset,save"

p = figure(tools=TOOLS, plot_width=1000, toolbar_location="left")

# Using the custom TickFormatter. You must always define date_labels
p.xaxis[0].formatter = DateGapTickFormatter(date_labels = date_labels)

# x coordinates must be integers. If, for example, df.index holds
# datetimes, you should replace them with an integer sequence
p.segment(df.index, df.high, df.index, df.low, color="black")
p.rect(df.index[inc], mids[inc], w, spans[inc], fill_color="#D5E1DD", line_color="black")
p.rect(df.index[dec], mids[dec], w, spans[dec], fill_color="#F2583E", line_color="black")

p.title = "MSFT Candlestick with custom x axis"
p.xaxis.major_label_orientation = pi/4

p.grid[0].ticker.desired_num_ticks = 6

show(p)  # open a browser

Because the code uses the dataframe index as the x coordinate, your data must be sorted in ascending date order. If your time series is in descending date order, you can reverse it for use with the code above:

df.sort_values(by='date', inplace=True)
df.reset_index(drop=True, inplace=True)
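As a quick check of what these two calls do, here is a small sketch on a made-up three-row frame (the dates and prices are invented for illustration). It uses the non-inplace form, which should give the same result as the in-place calls above.

```python
import pandas as pd

# Invented sample data in descending date order.
df = pd.DataFrame({
    "date": ["2000-03-07", "2000-03-06", "2000-03-03"],
    "close": [3.0, 2.0, 1.0],
})
df["date"] = pd.to_datetime(df["date"])

# Sort oldest first, then discard the old index so that positions
# 0, 1, 2 follow date order and can serve as x coordinates.
df = df.sort_values(by="date").reset_index(drop=True)

print(df)
```

Without `reset_index(drop=True)`, the rows would keep their original index labels (2, 1, 0 here), so plotting against `df.index` would still draw the candles in the old order.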


Source: https://stackoverflow.com/questions/23585545/how-do-i-make-bokeh-omit-missing-dates-when-using-datetime-as-x-axis
