Creating a temporal range time-series spiral plot

猫巷女王i 2020-12-29 09:33

Similarly to this question, I'm interested in creating time series spirals. The solution doesn't necessarily have to be implemented in R or using ggplot, but it seems the

3 answers
  • 2020-12-29 09:49

    Still needs work, but it's a start, using Python and Matplotlib.

    The idea is to plot a spiral timeline in polar coordinates with a period of one week. Each event is an arc of this spiral, colored according to its dist value.

    There are lots of overlapping intervals, though, which this visualization tends to hide. Semitransparent arcs with a carefully chosen colormap might work better.

    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    import matplotlib.patheffects as mpe
    import pandas as pd
    
    # styling
    LINEWIDTH=4
    EDGEWIDTH=1
    CAPSTYLE="projecting"
    COLORMAP="viridis_r"
    ALPHA=1
    FIRSTDAY=6 # 0=Mon, 6=Sun
    
    # load dataset and parse timestamps
    df = pd.read_csv('trips.csv')
    df[['trip_start', 'trip_stop']] = df[['trip_start', 'trip_stop']].apply(pd.to_datetime)
    
    # set origin at the first FIRSTDAY before the first trip, midnight
    first_trip = df['trip_start'].min()
    # the modulo keeps the offset in 0..6 days, so the origin never falls after the first trip
    origin = (first_trip - pd.to_timedelta((first_trip.weekday() - FIRSTDAY) % 7, unit='d')).normalize()
    # exactly seven labels, one per angular tick
    weekdays = pd.date_range(origin, periods=7, freq='D').strftime("%a").tolist()
    
    # convert trip timestamps to week fractions
    df['start'] = (df['trip_start'] - origin) / np.timedelta64(1, 'W')
    df['stop']  = (df['trip_stop']  - origin) / np.timedelta64(1, 'W')
    
    # sort dataset so shortest trips are plotted last
    # should prevent longer events from covering shorter ones; still suboptimal
    df = df.sort_values('dist', ascending=False).reset_index()
    
    fig = plt.figure(figsize=(8, 6))
    ax = fig.add_subplot(projection="polar")  # fig.gca(projection=...) was removed in recent Matplotlib
    
    for idx, event in df.iterrows():
        # sample normalized distance from colormap
        ndist = event['dist'] / df['dist'].max()
        color = plt.get_cmap(COLORMAP)(ndist)
        tstart, tstop = event.loc[['start', 'stop']]
        # timestamps are in week fractions, 2pi is one week
        # at least two samples so that very short trips still draw a line
        nsamples = max(2, int(1000. * (tstop - tstart)))
        t = np.linspace(tstart, tstop, nsamples)
        theta = 2 * np.pi * t
        arc, = ax.plot(theta, t, lw=LINEWIDTH, color=color, solid_capstyle=CAPSTYLE, alpha=ALPHA)
        if EDGEWIDTH > 0:
            arc.set_path_effects([mpe.Stroke(linewidth=LINEWIDTH+EDGEWIDTH, foreground='black'), mpe.Normal()])
    
    # grid and labels
    ax.set_rticks([])
    ax.set_theta_zero_location("N")
    ax.set_theta_direction(-1)
    ax.set_xticks(np.linspace(0, 2*np.pi, 7, endpoint=False))
    ax.set_xticklabels(weekdays)
    ax.tick_params('x', pad=2)
    ax.grid(True)
    # setup a custom colorbar, everything's always a bit tricky with mpl colorbars
    vmin = df['dist'].min()
    vmax = df['dist'].max()
    norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax)
    sm = plt.cm.ScalarMappable(cmap=COLORMAP, norm=norm)
    sm.set_array([])
    plt.colorbar(sm, ticks=np.linspace(vmin, vmax, 10), fraction=0.04, aspect=60, pad=0.1, label="distance", ax=ax)
    
    plt.savefig("spiral.png", pad_inches=0, bbox_inches="tight")
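
The mapping from timestamps to spiral coordinates used above can be checked in isolation. This is only a minimal sketch: the helper name `to_spiral` and the sample dates are made up for illustration, and it assumes the same convention as the plot (radius in weeks since the origin, one full turn per week).

```python
import numpy as np
import pandas as pd

def to_spiral(ts, origin):
    """Map a timestamp to (theta, r): r is weeks since origin, theta = 2*pi*r."""
    r = (pd.Timestamp(ts) - pd.Timestamp(origin)) / pd.Timedelta(weeks=1)
    return 2 * np.pi * r, r

# hypothetical origin: Sunday 2017-09-03 at midnight
origin = pd.Timestamp("2017-09-03")
theta, r = to_spiral("2017-09-13 12:00", origin)        # Wednesday noon, second week
theta_next, r_next = to_spiral("2017-09-20 12:00", origin)  # same time, one week later
```

Two timestamps exactly one week apart land on the same angle, one unit of radius apart, which is why consecutive turns of the spiral never touch.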
    

    Full timeline

    To see that it's a spiral that never overlaps itself, and that it works for longer events too, you can plot the full timeline (here with LINEWIDTH=3.5 to limit moiré fringing).

    fullt = np.linspace(df['start'].min(), df['stop'].max(), 10000)
    theta = 2 * np.pi * fullt
    ax.plot(theta, fullt, lw=LINEWIDTH,
            path_effects=[mpe.Stroke(linewidth=LINEWIDTH+EDGEWIDTH, foreground='black'), mpe.Normal()])
    

    Example with a random set...

    Here's the plot for a random dataset of 200 mainly short trips with the occasional 1 to 2 weeks long ones.

    N = 200
    df = pd.DataFrame()
    df["start"] = np.random.uniform(0, 20, size=N)
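    # draw one duration per trip: mostly short, occasionally 1 to 2 weeks long
    duration = np.where(np.random.random(size=N) < 0.98,
                        np.random.uniform(0, 0.1, size=N),
                        np.random.uniform(1., 2., size=N))
    df["stop"] = df["start"] + duration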
    df["dist"] = np.random.random(size=N)
    

    ... and different styles

    Variations include the inferno_r colormap, round or butt line caps, semitransparent arcs, and bolder edges.

  • 2020-12-29 09:57

    Here's a start. Let me know if this is what you had in mind.

    I began with your data sample and put trip_start and trip_stop into POSIXct format before continuing with the code below.

    library(tidyverse)
    library(lubridate)
    
    dat = dat %>% 
      mutate(start=(hour(trip_start)*60 + minute(trip_start) + second(trip_start)/60)/(24*60) + wday(trip_start),
             stop=(hour(trip_stop)*60 + minute(trip_stop) + second(trip_stop)/60)/(24*60) + wday(trip_stop),
             tod = case_when(hour(trip_start) < 6 ~ "night",
                             hour(trip_start) < 12 ~ "morning",
                             hour(trip_start) < 18 ~ "afternoon",
                             hour(trip_start) < 24 ~ "evening"))
    
    ggplot(dat) +
      geom_segment(aes(x=start, xend=stop, 
                       y=trip_start, 
                       yend=trip_stop, 
                       colour=tod), 
                   size=5, show.legend = FALSE) +
      coord_polar() +
      scale_y_datetime(breaks=seq(as.POSIXct("2017-09-01"), as.POSIXct("2017-12-31"), by="week")) +
      scale_x_continuous(limits=c(1,8), breaks=1:7, 
                         labels=weekdays(x=as.Date(seq(7)+2, origin="1970-01-01"), 
                                         abbreviate=TRUE))+
      expand_limits(y=as.POSIXct("2017-08-25")) +
      theme_bw() +
      scale_colour_manual(values=c(night="black", morning="orange",
                                   afternoon="orange", evening="blue")) +
      labs(x="",y="")
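
For readers following along in Python, the angular coordinate computed in the `mutate` call can be sketched as below. This is an illustration, not part of the answer: the function name `week_position` is made up, and it assumes lubridate's default `wday` numbering (1 = Sunday through 7 = Saturday).

```python
from datetime import datetime

def week_position(ts):
    """Day of week (1 = Sunday ... 7 = Saturday) plus the fraction of the day elapsed."""
    wday = (ts.weekday() + 1) % 7 + 1   # Python counts Monday as 0, so shift to Sunday = 1
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return wday + seconds / 86400

pos = week_position(datetime(2017, 9, 13, 18, 0))  # a Wednesday at 18:00
```

The result ranges from 1 up to just under 8, which matches the `limits=c(1,8)` used on the x scale.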
    

  • 2020-12-29 10:07

    This could be achieved relatively straightforwardly with d3. I'll use your data to create a rough template of one basic possible approach. Here's what the result of this approach might look like:

    The key ingredient is d3's radial line generator, which lets us define a line by giving an angle and a radius for each point. (A recent answer showing another spiral graph started me down the path to this one.)

    All we need to do is scale angle and radius to be able to use this effectively (for which we need the first time and last time in the dataset):

    var angle = d3.scaleTime()
      .domain([start,end])
      .range([0,Math.PI * 2 * numberWeeks])
    
    var radius = d3.scaleTime()
      .domain([start,end])
      .range([minInnerRadius,maxOuterRadius])
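
What the two scales do is plain linear interpolation over time. The Python sketch below only illustrates that arithmetic and is not d3 code; `linear_time_scale` is a made-up helper, and the dates, week count, and radii are placeholder values standing in for `start`, `end`, `numberWeeks`, `minInnerRadius` and `maxOuterRadius`.

```python
from datetime import datetime
import math

def linear_time_scale(domain, out_range):
    """Return a function mapping a datetime in `domain` linearly onto `out_range`."""
    (d0, d1), (r0, r1) = domain, out_range
    span = (d1 - d0).total_seconds()

    def scale(t):
        frac = (t - d0).total_seconds() / span
        return r0 + frac * (r1 - r0)

    return scale

# placeholder values: a four-week domain and arbitrary pixel radii
start, end = datetime(2017, 9, 3), datetime(2017, 10, 1)
number_weeks = 4
angle = linear_time_scale((start, end), (0, 2 * math.pi * number_weeks))
radius = linear_time_scale((start, end), (50, 250))

t = datetime(2017, 9, 17)  # halfway through the domain
```

Halfway through the domain the angle has completed two of the four turns and the radius sits midway between the inner and outer limits.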
    

    From there we can create a spiral quite easily: we sample some dates throughout the interval and pass them to the radial line function:

    var spiral = d3.radialLine()
        .curve(d3.curveCardinal)
        .angle(angle)
        .radius(radius);
    

    Here's a quick demonstration of just the spiral covering your time period. I'm assuming a base familiarity with d3 for this answer, so have not touched on a few parts of the code.

    Once we have that, it's just a matter of adding sections from the data. The simplest way is to draw a stroke with some width and color it appropriately. This works the same way as above, but rather than sampling points between the start and end times of the dataset, we sample between the start and end times of each datum:

        // append segments on spiral:  
        var segments = g.selectAll()
          .data(data)
          .enter()
          .append("path")
          .attr("d", function(d) {
            return /* sample points and feed to spiral function here */;
          })
          .style("stroke-width", /* appropriate width here */ )
          .style("stroke",function(d) { return /* color logic here */ })
    

    This might look something like this (with data mouseover).

    This is just a proof of concept. If you are looking for more control and a nicer look, you could create a polygonal path for each data entry and use both fill and stroke. As is, you'll have to make do with layering strokes to get borders, if desired, and with SVG options such as line caps.

    Also, since this is d3 and longer timespans may be hard to show all at once, you could show less time but rotate the spiral so that it animates through your time span, dropping events off at the end and creating them at the origin. Depending on the number of nodes, the chart might need to be drawn on canvas for this to run smoothly, but converting to canvas is relatively trivial in this case.


    For the sake of filling out the visualization a little with a legend and day labels, this is what I have.
