Matplotlib and Numpy - Create a calendar heatmap

旧时难觅i 2020-12-24 04:09

Is it possible to create a calendar heatmap without using pandas? If so, can someone post a simple example?

I have dates like Aug-16 and a count value like 16 and I

4 answers
  • 2020-12-24 04:28

    Edit: I now see the question asks for a plot without pandas. Even so, this question is a first page Google result for "python calendar heatmap", so I will leave this here. I recommend using pandas anyway. You probably already have it as a dependency of another package, and pandas has by far the best APIs for working with datetime data (pandas.Timestamp and pandas.DatetimeIndex).

    The only Python package that I can find for these plots is calmap, which is unmaintained and incompatible with recent matplotlib. So I decided to write my own. It produces plots like the following:

    Here is the code. The input is a series with a datetime index giving the values for the heatmap:

    import numpy as np
    import pandas as pd
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    
    
    DAYS = ['Sun.', 'Mon.', 'Tues.', 'Wed.', 'Thurs.', 'Fri.', 'Sat.']
    MONTHS = ['Jan.', 'Feb.', 'Mar.', 'Apr.', 'May', 'June', 'July', 'Aug.', 'Sept.', 'Oct.', 'Nov.', 'Dec.']
    
    
    def date_heatmap(series, start=None, end=None, mean=False, ax=None, **kwargs):
        '''Plot a calendar heatmap given a datetime series.
    
        Arguments:
            series (pd.Series):
                A series of numeric values with a datetime index. Values occurring
                on the same day are combined by sum.
            start (Any):
                The first day to be considered in the plot. The value can be
                anything accepted by :func:`pandas.to_datetime`. The default is the
                earliest date in the data.
            end (Any):
                The last day to be considered in the plot. The value can be
                anything accepted by :func:`pandas.to_datetime`. The default is the
                latest date in the data.
            mean (bool):
                Combine values occurring on the same day by mean instead of sum.
            ax (matplotlib.axes.Axes or None):
                The axes on which to draw the heatmap. The default is the current
                axes in the :mod:`~matplotlib.pyplot` API.
            **kwargs:
                Forwarded to :meth:`~matplotlib.axes.Axes.pcolormesh` for drawing
                the heatmap.
    
        Returns:
            matplotlib.axes.Axes:
                The axes on which the heatmap was drawn. This is set as the current
                axes in the :mod:`~matplotlib.pyplot` API.
        '''
        # Combine values occurring on the same day.
        dates = series.index.floor('D')
        group = series.groupby(dates)
        series = group.mean() if mean else group.sum()
    
        # Parse start/end, defaulting to the min/max of the index.
        start = pd.to_datetime(start or series.index.min())
        end = pd.to_datetime(end or series.index.max())
    
        # We use [start, end) as a half-open interval below.
        end += np.timedelta64(1, 'D')
    
        # Get the previous/following Sunday to start/end.
        # Pandas and numpy day-of-week conventions are Monday=0 and Sunday=6.
        start_sun = start - np.timedelta64((start.dayofweek + 1) % 7, 'D')
        end_sun = end + np.timedelta64(7 - end.dayofweek - 1, 'D')
    
        # Create the heatmap and track ticks.
        num_weeks = (end_sun - start_sun).days // 7
        heatmap = np.zeros((7, num_weeks))
        ticks = {}  # week number -> month name
        for week in range(num_weeks):
            for day in range(7):
                date = start_sun + np.timedelta64(7 * week + day, 'D')
                if date.day == 1:
                    ticks[week] = MONTHS[date.month - 1]
                if date.dayofyear == 1:
                    ticks[week] += f'\n{date.year}'
                if start <= date < end:
                    heatmap[day, week] = series.get(date, 0)
    
        # Get the coordinates, offset by 0.5 to align the ticks.
        y = np.arange(8) - 0.5
        x = np.arange(num_weeks + 1) - 0.5
    
        # Plot the heatmap. Prefer pcolormesh over imshow so that the figure can be
        # vectorized when saved to a compatible format. We must invert the axis for
        # pcolormesh, but not for imshow, so that it reads top-bottom, left-right.
        ax = ax or plt.gca()
        mesh = ax.pcolormesh(x, y, heatmap, **kwargs)
        ax.invert_yaxis()
    
        # Set the ticks.
        ax.set_xticks(list(ticks.keys()))
        ax.set_xticklabels(list(ticks.values()))
        ax.set_yticks(np.arange(7))
        ax.set_yticklabels(DAYS)
    
        # Set the current image and axes in the pyplot API.
        plt.sca(ax)
        plt.sci(mesh)
    
        return ax
    
    
    def date_heatmap_demo():
        '''An example for `date_heatmap`.
    
        Most of the sizes here are chosen arbitrarily to look nice with 1yr of
        data. You may need to fiddle with the numbers to look right on other data.
        '''
        # Get some data, a series of values with datetime index.
        data = np.random.randint(5, size=365)
        data = pd.Series(data)
        data.index = pd.date_range(start='2017-01-01', end='2017-12-31', freq='1D')
    
        # Create the figure. For the aspect ratio, one year is 7 days by 53 weeks.
        # We widen it further to account for the tick labels and color bar.
        figsize = plt.figaspect(7 / 56)
        fig = plt.figure(figsize=figsize)
    
        # Plot the heatmap with a color bar.
        ax = date_heatmap(data, edgecolor='black')
        plt.colorbar(ticks=range(5), pad=0.02)
    
        # Use a discrete color map with 5 colors (the data ranges from 0 to 4).
        # Extending the color limits by 0.5 aligns the ticks in the color bar.
        # plt.get_cmap is used here because mpl.cm.get_cmap was removed in Matplotlib 3.9.
        cmap = plt.get_cmap('Blues', 5)
        plt.set_cmap(cmap)
        plt.clim(-0.5, 4.5)
    
        # Force the cells to be square. If this is set, the size of the color bar
        # may look weird compared to the size of the heatmap. That can be corrected
        # by the aspect ratio of the figure or scale of the color bar.
        ax.set_aspect('equal')
    
        # Save to a file. For embedding in a LaTeX doc, consider the PDF backend.
        # http://sbillaudelle.de/2015/02/23/seamlessly-embedding-matplotlib-output-into-latex.html
        fig.savefig('heatmap.pdf', bbox_inches='tight')
    
        # The figure must be explicitly closed if it was not shown.
        plt.close(fig)
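
    Since the question asks for a version without pandas, the grid-building step above can also be sketched with only `datetime` and numpy. This is a rough sketch, not part of the original answer: the helper name `sunday_first_grid` and the sample data are hypothetical, and it assumes the input dates are `datetime.date` objects.

```python
import datetime as dt
from collections import defaultdict

import numpy as np


def sunday_first_grid(dates, values):
    """Sum values per day and lay them out as a 7 x num_weeks array.

    Rows are weekdays starting on Sunday; columns are consecutive weeks.
    Days with no data are left at 0, as in date_heatmap above.
    """
    # Combine values occurring on the same day by sum.
    totals = defaultdict(float)
    for day, value in zip(dates, values):
        totals[day] += value

    start, end = min(totals), max(totals)

    # date.weekday() uses Monday=0 ... Sunday=6, so (weekday + 1) % 7 is the
    # number of days since the most recent Sunday.
    start_sun = start - dt.timedelta(days=(start.weekday() + 1) % 7)
    num_weeks = (end - start_sun).days // 7 + 1

    grid = np.zeros((7, num_weeks))
    for day, total in totals.items():
        offset = (day - start_sun).days
        grid[offset % 7, offset // 7] = total
    return start_sun, grid


# Hypothetical sample data: two entries share 2017-01-03 and are summed.
dates = [dt.date(2017, 1, 3), dt.date(2017, 1, 3), dt.date(2017, 1, 14)]
values = [2, 3, 7]
start_sun, grid = sunday_first_grid(dates, values)
```

    The returned array has the same orientation as the `heatmap` array inside `date_heatmap`, so it could be passed to `ax.pcolormesh` in the same way.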
    
  • 2020-12-24 04:33

    I was looking to create a calendar heatmap where each month is displayed separately. I also needed to annotate each day with the day number (day_of_month) and its value label.

    I've been inspired by the answers posted here and also the following sites:

    Here, although in R

    Heatmap using pcolormesh

    However I didn't seem to find something exactly as I was looking for, so I've decided to post my solution here to perhaps save others wanting the same kind of plot some time.

    My example uses a bit of Pandas simply to generate some dummy data, so you can easily plug your own data source instead. Other than that it's just matplotlib.

    Output from the code is given below. For my needs I also wanted to highlight days where the data was 0 (see 1st January).

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib.patches import Polygon
    
    # Settings
    years = [2018] # [2018, 2019, 2020]
    weeks = [1, 2, 3, 4, 5, 6]
    days = ['M', 'T', 'W', 'T', 'F', 'S', 'S']
    month_names = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
                   'September', 'October', 'November', 'December']
    
    def generate_data():
        idx = pd.date_range('2018-01-01', periods=365, freq='D')
        return pd.Series(range(len(idx)), index=idx)
    
    
    def split_months(df, year):
        """
        Take a df, slice by year, and produce a list of months,
        where each month is a 2D array in the shape of the calendar
        :param df: dataframe or series
        :return: matrix for daily values and numerals
        """
        df = df[df.index.year == year]
    
    
        # Empty matrices
        a = np.empty((6, 7))
        a[:] = np.nan
    
        day_nums = {m:np.copy(a) for m in range(1,13)}  # matrix for day numbers
        day_vals = {m:np.copy(a) for m in range(1,13)}  # matrix for day values
    
        # Logic to shape datetimes to matrices in calendar layout
        for d in df.items():  # Series.iteritems() was removed in pandas 2.0; use iterrows() for a DataFrame
    
            day = d[0].day
            month = d[0].month
            col = d[0].dayofweek
    
            if d[0].is_month_start:
                row = 0
    
            day_nums[month][row, col] = day  # day number (1-31)
            day_vals[month][row, col] = d[1] # day value (the heatmap data)
    
            if col == 6:
                row += 1
    
        return day_nums, day_vals
    
    
    def create_year_calendar(day_nums, day_vals):
        fig, ax = plt.subplots(3, 4, figsize=(14.85, 10.5))
    
        for i, axs in enumerate(ax.flat):
    
            axs.imshow(day_vals[i+1], cmap='viridis', vmin=1, vmax=365)  # heatmap
            axs.set_title(month_names[i])
    
            # Labels
            axs.set_xticks(np.arange(len(days)))
            axs.set_xticklabels(days, fontsize=10, fontweight='bold', color='#555555')
            axs.set_yticklabels([])
    
            # Tick marks
            axs.tick_params(axis=u'both', which=u'both', length=0)  # remove tick marks
            axs.xaxis.tick_top()
    
            # Modify tick locations for proper grid placement
            axs.set_xticks(np.arange(-.5, 6, 1), minor=True)
            axs.set_yticks(np.arange(-.5, 5, 1), minor=True)
            axs.grid(which='minor', color='w', linestyle='-', linewidth=2.1)
    
            # Despine
            for edge in ['left', 'right', 'bottom', 'top']:
                axs.spines[edge].set_color('#FFFFFF')
    
            # Annotate
            for w in range(len(weeks)):
                for d in range(len(days)):
                    day_val = day_vals[i+1][w, d]
                    day_num = day_nums[i+1][w, d]
    
                    # Value label
                    axs.text(d, w+0.3, f"{day_val:0.0f}",
                             ha="center", va="center",
                             fontsize=7, color="w", alpha=0.8)
    
                    # If value is 0, draw a grey patch
                    if day_val == 0:
                        patch_coords = ((d - 0.5, w - 0.5),
                                        (d - 0.5, w + 0.5),
                                        (d + 0.5, w + 0.5),
                                        (d + 0.5, w - 0.5))
    
                        square = Polygon(patch_coords, fc='#DDDDDD')
                        axs.add_artist(square)
    
                    # If day number is a valid calendar day, add an annotation
                    if not np.isnan(day_num):
                        axs.text(d+0.45, w-0.31, f"{day_num:0.0f}",
                                 ha="right", va="center",
                                 fontsize=6, color="#003333", alpha=0.8)  # day
    
                    # Aesthetic background for calendar day number
                    patch_coords = ((d-0.1, w-0.5),
                                    (d+0.5, w-0.5),
                                    (d+0.5, w+0.1))
    
                    triangle = Polygon(patch_coords, fc='w', alpha=0.7)
                    axs.add_artist(triangle)
    
        # Final adjustments
        fig.suptitle('Calendar', fontsize=16)
        plt.subplots_adjust(left=0.04, right=0.96, top=0.88, bottom=0.04)
    
        # Save to file
        plt.savefig('calendar_example.pdf')
    
    
    for year in years:
        df = generate_data()
        day_nums, day_vals = split_months(df, year)
        create_year_calendar(day_nums, day_vals)
    

    There is probably a lot of room for optimisation, but this gets what I need done.
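
    One possible simplification, offered as a sketch rather than as the method used above: the standard library's `calendar.monthcalendar` already returns a month as a list of Monday-first weeks, with 0 marking days outside the month. The helper name `month_grid` is hypothetical.

```python
import calendar

import numpy as np


def month_grid(year, month):
    """Return a 6 x 7 array of day numbers, with NaN outside the month.

    Columns run Monday to Sunday, the default of the calendar module.
    """
    weeks = calendar.monthcalendar(year, month)  # 0 marks days outside the month
    grid = np.full((6, 7), np.nan)
    for row, week in enumerate(weeks):
        for col, day in enumerate(week):
            if day:
                grid[row, col] = day
    return grid


jan = month_grid(2018, 1)
```

    The result has the same 6 x 7 shape and NaN padding as the `day_nums` matrices, so the positions of its non-NaN cells could be used to place the daily values as well.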

  • 2020-12-24 04:36

    It's certainly possible, but you'll need to jump through a few hoops.

    First off, I'm going to assume you mean a calendar display that looks like a calendar, as opposed to a more linear format (a linear formatted "heatmap" is much easier than this).

    The key is reshaping your arbitrary-length 1D series into an Nx7 2D array where each row is a week and columns are days. That's easy enough, but you also need to properly label months and days, which can get a touch verbose.

    Here's an example. It doesn't even remotely try to handle crossing year boundaries (e.g. Dec 2014 to Jan 2015). However, hopefully it gets you started:

    import datetime as dt
    import matplotlib.pyplot as plt
    import numpy as np
    
    def main():
        dates, data = generate_data()
        fig, ax = plt.subplots(figsize=(6, 10))
        calendar_heatmap(ax, dates, data)
        plt.show()
    
    def generate_data():
        num = 100
        data = np.random.randint(0, 20, num)
        start = dt.datetime(2015, 3, 13)
        dates = [start + dt.timedelta(days=i) for i in range(num)]
        return dates, data
    
    def calendar_array(dates, data):
        i, j = zip(*[d.isocalendar()[1:] for d in dates])
        i = np.array(i) - min(i)
        j = np.array(j) - 1
        ni = max(i) + 1
    
        calendar = np.nan * np.zeros((ni, 7))
        calendar[i, j] = data
        return i, j, calendar
    
    
    def calendar_heatmap(ax, dates, data):
        i, j, calendar = calendar_array(dates, data)
        im = ax.imshow(calendar, interpolation='none', cmap='summer')
        label_days(ax, dates, i, j, calendar)
        label_months(ax, dates, i, j, calendar)
        ax.figure.colorbar(im)
    
    def label_days(ax, dates, i, j, calendar):
        ni, nj = calendar.shape
        day_of_month = np.nan * np.zeros((ni, 7))
        day_of_month[i, j] = [d.day for d in dates]
    
        for (i, j), day in np.ndenumerate(day_of_month):
            if np.isfinite(day):
                ax.text(j, i, int(day), ha='center', va='center')
    
        ax.set(xticks=np.arange(7), 
               xticklabels=['M', 'T', 'W', 'R', 'F', 'S', 'S'])
        ax.xaxis.tick_top()
    
    def label_months(ax, dates, i, j, calendar):
        month_labels = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul',
                                 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
        months = np.array([d.month for d in dates])
        uniq_months = sorted(set(months))
        yticks = [i[months == m].mean() for m in uniq_months]
        labels = [month_labels[m - 1] for m in uniq_months]
        ax.set(yticks=yticks)
        ax.set_yticklabels(labels, rotation=90)
    
    main()
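
    The year-boundary limitation comes from `isocalendar()`: ISO week numbers restart around New Year, so the row index computed in `calendar_array` jumps backwards. A possible workaround, sketched here under the assumption that a Monday-first layout is wanted, is to derive the row from the date ordinal instead. The helper name `continuous_week_rows` is hypothetical.

```python
import datetime as dt

import numpy as np


def continuous_week_rows(dates):
    """Return (row, col) indices that keep increasing across a new year.

    date.toordinal() counts days from 0001-01-01, which is a Monday in the
    proleptic Gregorian calendar, so (ordinal - 1) // 7 is a Monday-based week
    counter that never resets and (ordinal - 1) % 7 equals date.weekday().
    """
    ordinals = np.array([d.toordinal() - 1 for d in dates])
    weeks = ordinals // 7
    return weeks - weeks.min(), ordinals % 7


# Three weeks of dates spanning the 2014/2015 boundary.
start = dt.date(2014, 12, 22)
dates = [start + dt.timedelta(days=n) for n in range(21)]

# ISO week numbers for the same dates, for comparison: they restart mid-range.
iso_weeks = [d.isocalendar()[1] for d in dates]

rows, cols = continuous_week_rows(dates)
```

    The returned `rows` and `cols` could stand in for `i` and `j` in `calendar_array`; the month labels would still need their own handling when the same month appears in two different years.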
    

  • 2020-12-24 04:44

    Below is code that can be used to generate a calendar map for daily profiles of a value.

    """
    Created on Tue Sep  4 11:17:25 2018
    
    @author: woldekidank
    """
    
    import numpy as np
    from datetime import date
    import datetime
    import matplotlib.pyplot as plt
    import random
    
    
    D = date(2016,1,1)
    Dord = date.toordinal(D)
    Dweekday = date.weekday(D)
    
    Dsnday = Dord - (Dweekday + 1) % 7  # ordinal of the Sunday on or before D (weekday() is Monday=0)
    square = np.array([[0, 0],[ 0, 1], [1, 1], [1, 0], [0, 0]])#x and y to draw a square
    row = 1
    count = 0
    while row != 0:
        for column in range(1,7+1):    #one week per row
            prof = np.ones([24, 1])
            hourly = np.zeros([24, 1])
            for i in range(1,24+1):
                prof[i-1, 0] = prof[i-1, 0] * random.uniform(0, 1)
                hourly[i-1, 0] = i / 24
            plt.title('Temperature Profile')
            plt.plot(square[:, 0] + column - 1, square[:, 1] - row + 1,color='r')    #go right each column, go down each row
            if date.fromordinal(Dsnday).month == D.month:
                if count == 0:
                    plt.plot(hourly, prof)
                else:
                    plt.plot(hourly + min(square[:, 0] + column - 1), prof + min(square[:, 1] - row + 1))
    
                plt.text(column - 0.5, 1.8 - row, date.fromordinal(Dsnday).strftime('%a'))
                plt.text(column - 0.5, 1.5 - row, date.fromordinal(Dsnday).day)
    
            Dsnday = Dsnday + 1
            count = count + 1
    
        if date.fromordinal(Dsnday).month == D.month:
            row = row + 1    #new row
        else:
            row = 0    #stop the while loop
    
    plt.show()  # display the finished calendar
    

    Below is the output from this code

    Image of data series on calendar days
