Similarly to this question, I'm interested in creating time series spirals. The solution doesn't necessarily have to be implemented in R or using ggplot, but it seems the
Still needs work, but it's a start, with Python and matplotlib.
The idea is to plot a spiral timeline in polar coordinates with a period of one week. Each event is an arc of this spiral, colored according to its dist value.
There are a lot of overlapping intervals, though, which this visualization tends to hide. Semitransparent arcs with a carefully chosen colormap might work better.
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.patheffects as mpe
import pandas as pd
# styling
LINEWIDTH=4
EDGEWIDTH=1
CAPSTYLE="projecting"
COLORMAP="viridis_r"
ALPHA=1
FIRSTDAY=6 # 0=Mon, 6=Sun
# load dataset and parse timestamps
df = pd.read_csv('trips.csv')
df[['trip_start', 'trip_stop']] = df[['trip_start', 'trip_stop']].apply(pd.to_datetime)
# set origin at the first FIRSTDAY before the first trip, midnight
first_trip = df['trip_start'].min()
# the modulo keeps the offset in 0..6 days so the origin never falls after the first trip
origin = (first_trip - pd.to_timedelta((first_trip.weekday() - FIRSTDAY) % 7, unit='d')).normalize()
# exactly 7 labels, one per angular tick
weekdays = pd.date_range(origin, periods=7, freq='D').strftime("%a").tolist()
# convert trip timestamps to week fractions
df['start'] = (df['trip_start'] - origin) / np.timedelta64(1, 'W')
df['stop'] = (df['trip_stop'] - origin) / np.timedelta64(1, 'W')
# sort dataset so shortest trips are plotted last
# should prevent longer events to cover shorter ones, still suboptimal
df = df.sort_values('dist', ascending=False).reset_index()
fig = plt.figure(figsize=(8, 6))
# fig.gca(projection=...) no longer accepts keyword arguments in recent matplotlib
ax = fig.add_subplot(projection="polar")
for idx, event in df.iterrows():
    # sample normalized distance from colormap
    ndist = event['dist'] / df['dist'].max()
    color = mpl.colormaps[COLORMAP](ndist)
    tstart, tstop = event.loc[['start', 'stop']]
    # timestamps are in week fractions, 2pi is one week
    # (at least 2 samples so that very short trips are still drawn)
    nsamples = max(2, int(1000. * (tstop - tstart)))
    t = np.linspace(tstart, tstop, nsamples)
    theta = 2 * np.pi * t
    arc, = ax.plot(theta, t, lw=LINEWIDTH, color=color, solid_capstyle=CAPSTYLE, alpha=ALPHA)
    if EDGEWIDTH > 0:
        arc.set_path_effects([mpe.Stroke(linewidth=LINEWIDTH+EDGEWIDTH, foreground='black'), mpe.Normal()])
# grid and labels
ax.set_rticks([])
ax.set_theta_zero_location("N")
ax.set_theta_direction(-1)
ax.set_xticks(np.linspace(0, 2*np.pi, 7, endpoint=False))
ax.set_xticklabels(weekdays)
ax.tick_params('x', pad=2)
ax.grid(True)
# setup a custom colorbar, everything's always a bit tricky with mpl colorbars
vmin = df['dist'].min()
vmax = df['dist'].max()
norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax)
sm = plt.cm.ScalarMappable(cmap=COLORMAP, norm=norm)
sm.set_array([])
plt.colorbar(sm, ticks=np.linspace(vmin, vmax, 10), fraction=0.04, aspect=60, pad=0.1, label="distance", ax=ax)
plt.savefig("spiral.png", pad_inches=0, bbox_inches="tight")
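The origin and week-fraction steps are easy to check on their own. Here is a minimal stdlib-only sketch of the same arithmetic; the helper names `week_origin` and `week_fraction` are made up for illustration, and it assumes timezone-naive timestamps.

```python
from datetime import datetime, timedelta

FIRSTDAY = 6  # 0=Mon, 6=Sun, same convention as datetime.weekday()

def week_origin(first, firstday=FIRSTDAY):
    """Midnight of the most recent `firstday` on or before `first` (hypothetical helper)."""
    days_back = (first.weekday() - firstday) % 7  # always 0..6
    midnight = first.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=days_back)

def week_fraction(ts, origin):
    """Elapsed time since `origin`, measured in weeks (hypothetical helper)."""
    return (ts - origin) / timedelta(weeks=1)

first = datetime(2017, 9, 6, 14, 30)   # a Wednesday
origin = week_origin(first)            # the Sunday before, at midnight
frac = week_fraction(first, origin)    # a bit more than half a week
```

The integer part of the week fraction is the turn of the spiral, and the fractional part is the position within the week.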
To see that the spiral never overlaps itself and that this also works for longer events, you can plot the full timeline (here with LINEWIDTH=3.5 to limit moiré fringing).
fullt = np.linspace(df['start'].min(), df['stop'].max(), 10000)
theta = 2 * np.pi * fullt
ax.plot(theta, fullt, lw=LINEWIDTH,
        path_effects=[mpe.Stroke(linewidth=LINEWIDTH+EDGEWIDTH, foreground='black'), mpe.Normal()])
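The reason the turns stay apart can also be checked numerically: with radius t and angle 2πt, two points exactly one week apart point in the same direction but differ by exactly 1 in radius. A small sketch, with `spiral_point` and `same_direction` as illustrative names only:

```python
import math

def spiral_point(t):
    """Polar coordinates (theta, r) of week fraction t, same mapping as the plot."""
    return 2 * math.pi * t, t

def same_direction(theta_a, theta_b, tol=1e-9):
    """True when two angles differ by a whole number of turns (within tol)."""
    diff = (theta_b - theta_a) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff) < tol

theta0, r0 = spiral_point(3.25)
theta1, r1 = spiral_point(4.25)  # exactly one week later
```

This only shows that the spacing is constant in data units. Whether adjacent strokes visually touch still depends on the line width and figure size.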
Here's the plot for a random dataset of 200 mainly short trips, with the occasional 1- to 2-week-long one.
N = 200
df = pd.DataFrame()
df["start"] = np.random.uniform(0, 20, size=N)
df["stop"] = df["start"] + np.random.choice([np.random.uniform(0, 0.1),
                                             np.random.uniform(1., 2.)], p=[0.98, 0.02], size=N)
df["dist"] = np.random.random(size=N)
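One thing to be aware of in the snippet above: the two `np.random.uniform` calls inside the list are evaluated once, so every short trip gets the same duration and every long trip gets the same duration. If you want a different duration per trip, something like the following sketch should do it (the seed and the variable names are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
N = 200

start = rng.uniform(0, 20, size=N)
# roughly 2% of trips are long (1 to 2 weeks), the rest are short
is_long = rng.random(N) < 0.02
duration = np.where(is_long,
                    rng.uniform(1.0, 2.0, size=N),
                    rng.uniform(0.0, 0.1, size=N))
stop = start + duration
dist = rng.random(N)
```

The three arrays can then be assigned to the "start", "stop" and "dist" columns of the DataFrame as before.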
Variations: inferno_r color map, rounded or butted linecaps, semitransparent arcs, bolder edges, etc.
Here's a start. Let me know if this is what you had in mind.
I began with your data sample and put trip_start and trip_stop into POSIXct format before continuing with the code below.
library(tidyverse)
library(lubridate)
# start/stop: day of week (1 = Sunday) plus the fraction of the day elapsed
# (seconds are converted to minutes before being summed)
dat = dat %>%
  mutate(start=(hour(trip_start)*60 + minute(trip_start) + second(trip_start)/60)/(24*60) + wday(trip_start),
         stop=(hour(trip_stop)*60 + minute(trip_stop) + second(trip_stop)/60)/(24*60) + wday(trip_stop),
         tod = case_when(hour(trip_start) < 6 ~ "night",
                         hour(trip_start) < 12 ~ "morning",
                         hour(trip_start) < 18 ~ "afternoon",
                         hour(trip_start) < 24 ~ "evening"))
ggplot(dat) +
  geom_segment(aes(x=start, xend=stop,
                   y=trip_start,
                   yend=trip_stop,
                   colour=tod),
               size=5, show.legend = FALSE) +
  coord_polar() +
  scale_y_datetime(breaks=seq(as.POSIXct("2017-09-01"), as.POSIXct("2017-12-31"), by="week")) +
  scale_x_continuous(limits=c(1,8), breaks=1:7,
                     labels=weekdays(x=as.Date(seq(7)+2, origin="1970-01-01"),
                                     abbreviate=TRUE)) +
  expand_limits(y=as.POSIXct("2017-08-25")) +
  theme_bw() +
  scale_colour_manual(values=c(night="black", morning="orange",
                               afternoon="orange", evening="blue")) +
  labs(x="",y="")
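For anyone following along in Python rather than R, the two derived columns can be sketched like this. It mirrors the mutate step under my own assumptions: `weekday_position` and `time_of_day` are made-up names, and the day numbering copies lubridate's default where 1 is Sunday.

```python
from datetime import datetime

def weekday_position(ts):
    """Day of week (1 = Sunday ... 7 = Saturday) plus the fraction of the day elapsed."""
    day = (ts.weekday() + 1) % 7 + 1  # Python: Mon=0..Sun=6 -> Sun=1..Sat=7
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return day + seconds / 86400

def time_of_day(ts):
    """Same four buckets as the case_when call, based on the starting hour."""
    if ts.hour < 6:
        return "night"
    if ts.hour < 12:
        return "morning"
    if ts.hour < 18:
        return "afternoon"
    return "evening"

sunday_noon = datetime(2017, 9, 3, 12, 0)
position = weekday_position(sunday_noon)  # halfway through the first day
bucket = time_of_day(sunday_noon)
```

The position always falls in the range from 1 up to (but not including) 8, which matches the limits=c(1,8) used on the x scale.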
This could be achieved relatively straightforwardly with d3. I'll use your data to create a rough template of one basic possible approach. Here's what the result of this approach might look like:
The key ingredient is d3's radial line component, which lets us define a line by plotting angle and radius (here's a recent answer showing another spiral graph; it started me down the path to this one).
All we need to do is scale angle and radius to be able to use this effectively (for which we need the first time and last time in the dataset):
var angle = d3.scaleTime()
  .domain([start,end])
  .range([0,Math.PI * 2 * numberWeeks])
var radius = d3.scaleTime()
  .domain([start,end])
  .range([minInnerRadius,maxOuterRadius])
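If it helps to see what those two scales compute, here is the same linear mapping written out in Python. It is only a sketch of the idea, not d3's implementation; `make_scale`, the dates, and the pixel radii are placeholder choices of mine.

```python
from datetime import datetime
import math

def make_scale(domain, out_range):
    """Linear map from a datetime interval onto a numeric interval (hypothetical helper)."""
    d0, d1 = domain
    r0, r1 = out_range
    span = (d1 - d0).total_seconds()

    def scale(t):
        return r0 + (r1 - r0) * (t - d0).total_seconds() / span

    return scale

start = datetime(2017, 9, 3)   # placeholder dataset bounds, 4 weeks apart
end = datetime(2017, 10, 1)
number_weeks = (end - start).total_seconds() / (7 * 86400)

angle = make_scale((start, end), (0, 2 * math.pi * number_weeks))
radius = make_scale((start, end), (20, 200))  # placeholder inner/outer radii in pixels
```

With these, one week after the start gives an angle of one full turn, and the radius grows steadily from the inner to the outer value over the whole period.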
From there we can create a spiral quite easily: we sample some dates throughout the interval and then pass them to the radial line function:
var spiral = d3.radialLine()
  .curve(d3.curveCardinal)
  .angle(angle)
  .radius(radius);
Here's a quick demonstration of just the spiral covering your time period. I'm assuming a base familiarity with d3 for this answer, so have not touched on a few parts of the code.
Once we have that, it's just a matter of adding sections from the data. The simplest way is to draw a stroke with some width and color it appropriately. This requires the same as above, but rather than sampling points between the start and end times of the dataset, we sample between the start and end times of each datum:
// append segments on spiral:
var segments = g.selectAll()
  .data(data)
  .enter()
  .append("path")
  .attr("d", function(d) {
    return /* sample points and feed to spiral function here */;
  })
  .style("stroke-width", /* appropriate width here */ )
  .style("stroke",function(d) { return /* color logic here */ })
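The "sample points" step is the only part that is not plain d3 boilerplate, so here is a rough Python sketch of what it amounts to: sample a few times between a datum's start and stop, convert each to x/y, and join them into an SVG path string. It uses straight segments rather than d3's cardinal curve, and the function names, the per-week time unit, and the radii are all assumptions of mine.

```python
import math

def radial_to_xy(angle, radius):
    """Radial line convention: angle 0 at 12 o'clock, increasing clockwise, y pointing down."""
    return radius * math.sin(angle), -radius * math.cos(angle)

def segment_path(t0, t1, angle, radius, samples=20):
    """SVG path data for the piece of spiral between times t0 and t1 (hypothetical helper)."""
    points = []
    for i in range(samples):
        t = t0 + (t1 - t0) * i / (samples - 1)
        x, y = radial_to_xy(angle(t), radius(t))
        points.append(f"{x:.2f},{y:.2f}")
    return "M" + "L".join(points)

# time measured in weeks: one turn per week, radius growing by 10 px per week
angle = lambda t: 2 * math.pi * t
radius = lambda t: 20 + 10 * t

path = segment_path(0.0, 0.25, angle, radius)  # first quarter of the first week
```

The resulting string is what would go into the "d" attribute. For long events you would want more samples so the arc stays smooth.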
This might look something like this (with data mouseover).
This is just a proof of concept. If you want more control and a nicer look, you could create a polygonal path for each data entry and use both fill and stroke. As is, you'll have to make do with layering strokes to get borders if desired, and with SVG options such as line caps.
Also, since this is d3 and longer timespans may be hard to show all at once, you could show less time but rotate the spiral so that it animates through your time span, dropping events off at the end and creating them at the origin. The actual chart might need to be drawn on canvas for this to run smoothly, depending on the number of nodes, but converting to canvas is relatively trivial in this case.
For the sake of filling out the visualization a little with a legend and day labels, this is what I have.