x-Axis ticks as dates | 易学教程

Question

I have some data I would like to plot, consisting of two columns: one is a count and the other is the date it was recorded. Since I have over 2000 dates, it makes more sense not to show every single date as a tick on the x-axis; otherwise it won't be readable. However, I am having a hard time making the dates show up on the x-axis with some kind of logic. I have tried using the built-in tick locators for matplotlib, but somehow that is not working. Here is a preview of the data:

PatientTraffic = pd.DataFrame({'count' : CleanData.groupby("TimeStamp").size()}).reset_index()
display(PatientTraffic.head(3000))

TimeStamp   count
0   2016-03-13 12:20:00 1
1   2016-03-13 13:39:00 1
2   2016-03-13 13:43:00 1
3   2016-03-13 16:00:00 1
4   2016-03-14 13:27:00 1
... ... ...
2088    2020-02-18 16:00:00 8
2089    2020-02-19 16:00:00 8
2090    2020-02-20 16:00:00 8
2091    2020-02-21 16:00:00 8
2092    2020-02-22 16:00:00 8
2093 rows × 2 columns

When I plot it with these settings:

PatientTrafficPerTimeStamp = PatientTraffic.plot.bar(
        x='TimeStamp', 
        y='count',
        figsize=(20,3),
        title = "Patient Traffic over Time"
        
    )
PatientTrafficPerTimeStamp.xaxis.set_major_locator(plt.MaxNLocator(3))

I expect to get a bar chart where the x-axis has three ticks: one at the beginning, one in the middle, and one at the end. Maybe I'm using this wrong. Also, the ticks that appear are simply the first three values in the column, which is not what I want. Any help would be appreciated!

Answer 1:

You probably think that you have one problem, but you actually have two, and both stem from the fact that you use convenience functions. The problem that you are most likely not aware of is that pandas plots bars as categorical data. This makes sense under most conditions, but obviously not if you have TimeStamp data on your x-axis. Let's check that I didn't make that up:

import matplotlib.pyplot as plt
import pandas as pd

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
df = pd.read_csv("test.txt", sep=r"\s{2,}", engine="python")
#convert TS from string into datetime objects
df.TS = pd.to_datetime(df.TS, format="%Y-%m-%d %H:%M:%S")

#and plot it as you do directly from pandas that provides the data to matplotlib
df.plot.bar(
        x="TS", 
        y="Val",
        ax=ax1,
        title="pandas version"    
    )

#now plot the same data using matplotlib
ax2.bar(df.TS, df.Val, width=22)
ax2.tick_params(axis="x", labelrotation=90)
ax2.set_title("matplotlib version")    

plt.tight_layout()
plt.show()

Sample output:
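
The categorical placement can also be checked numerically rather than by eye. The following is only a sketch: it builds a small frame from a few rows of this answer's sample data instead of reading test.txt, and it forces the non-interactive Agg backend (my assumption, so that it runs without a display). It compares where each version puts its bars on the x-axis.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# A few rows taken from the sample data of this answer (not the full file)
df = pd.DataFrame({
    "TS": pd.to_datetime(["2016-03-13 12:20:00", "2016-06-17 16:00:00",
                          "2018-02-20 16:00:00", "2020-02-22 16:00:00"]),
    "Val": [1, 1, 8, 8],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
df.plot.bar(x="TS", y="Val", ax=ax1)   # pandas version
ax2.bar(df.TS, df.Val, width=22)       # matplotlib version

# pandas puts one tick per bar at integer positions, whatever the dates are
pandas_ticks = ax1.get_xticks()

# matplotlib centers each bar on the date number of its timestamp
mpl_centers = np.array([p.get_x() + p.get_width() / 2 for p in ax2.patches])

print("pandas tick positions:", pandas_ticks)
print("matplotlib bar centers:", mpl_centers)
print("as dates:", mdates.num2date(mpl_centers))
```

In the pandas version the gaps between bars are all the same, so the roughly two-year jump between the last rows is invisible; in the matplotlib version the distances are measured in days.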

So, we should plot directly with matplotlib to avoid losing the TimeStamp information. Obviously, we lose some comfort provided by pandas; e.g., we have to adjust the width of the bars and label the axes ourselves. Now, you could use the other convenience function, MaxNLocator, but as you noticed, it is written to work well under most conditions, and you give up control over the exact positioning of the ticks. Why not write our own locator using FixedLocator?

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib.ticker import FixedLocator
import pandas as pd

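def myownMaxNLocator(datacol, n):
    #convert the earliest and latest timestamp into matplotlib date numbers
    datemin = mdates.date2num(datacol.min())
    datemax = mdates.date2num(datacol.max())
    #n evenly spaced tick positions, including both end points
    xticks = np.linspace(datemin, datemax, n)
    return xticks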


fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
df = pd.read_csv("test.txt", sep=r"\s{2,}", engine="python")
df.TS = pd.to_datetime(df.TS, format="%Y-%m-%d %H:%M:%S")
    
df.plot.bar(
        x="TS", 
        y="Val",
        ax=ax1,
        title="pandas version"    
    )

ax2.bar(df.TS, df.Val, width=22)
ax2.set_title("matplotlib version")
dateticks = myownMaxNLocator(df.TS, 5)
ax2.xaxis.set_major_locator(FixedLocator(dateticks))
ax2.tick_params(axis="x", labelrotation=90)

plt.tight_layout()
plt.show()

Sample output:
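
If the automatically chosen label text is not what you want, one option is to pair the fixed tick positions with an explicit DateFormatter. This is a minimal sketch under a few assumptions of mine: the helper name evenly_spaced_date_ticks, the format string "%Y-%m-%d", the Agg backend, and the inline frame (a few rows of the sample data) are all for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FixedLocator


def evenly_spaced_date_ticks(datacol, n):
    """Hypothetical helper: n date numbers from the earliest to the latest date."""
    datemin = mdates.date2num(datacol.min())
    datemax = mdates.date2num(datacol.max())
    return np.linspace(datemin, datemax, n)


# A few rows taken from the sample data of this answer (not the full file)
df = pd.DataFrame({
    "TS": pd.to_datetime(["2016-03-13 12:20:00", "2016-06-17 16:00:00",
                          "2018-02-20 16:00:00", "2020-02-22 16:00:00"]),
    "Val": [1, 1, 8, 8],
})

fig, ax = plt.subplots(figsize=(10, 3))
ax.bar(df.TS, df.Val, width=22)

dateticks = evenly_spaced_date_ticks(df.TS, 3)
ax.xaxis.set_major_locator(FixedLocator(dateticks))
# Assumed label format; any strftime pattern can be used here
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))

fig.canvas.draw()  # make sure the tick labels are populated
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

With n=3 this gives the beginning, middle, and end ticks asked for in the question; note that the middle tick is the midpoint in time, not necessarily a date that occurs in the data.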

Here, the ticks start with the lowest value and end with the highest value. Alternatively, you could use the LinearLocator, which distributes the ticks evenly over the entire view:

from matplotlib.ticker import LinearLocator
...
ax2.bar(df.TS, df.Val, width=22)
ax2.set_title("matplotlib version")
ax2.xaxis.set_major_locator(LinearLocator(numticks=5))
ax2.tick_params(axis="x", labelrotation=90)
...

Sample output:
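
To make the difference between the two approaches concrete: as far as I can tell, LinearLocator spaces its ticks between the current view limits, so the outer ticks sit on the axis edges rather than on the first and last data point. The sketch below checks this on a small inline frame (a few rows of the sample data; the Agg backend is again my assumption).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import LinearLocator

# A few rows taken from the sample data of this answer (not the full file)
df = pd.DataFrame({
    "TS": pd.to_datetime(["2016-03-13 12:20:00", "2016-06-17 16:00:00",
                          "2018-02-20 16:00:00", "2020-02-22 16:00:00"]),
    "Val": [1, 1, 8, 8],
})

fig, ax = plt.subplots(figsize=(10, 3))
ax.bar(df.TS, df.Val, width=22)
ax.xaxis.set_major_locator(LinearLocator(numticks=5))

ticks = ax.get_xticks()             # tick positions as date numbers
view_min, view_max = ax.get_xlim()  # view limits, including bar width and margins
data_min = mdates.date2num(df.TS.min())

print("outer ticks on the view limits:",
      np.isclose(ticks[0], view_min), np.isclose(ticks[-1], view_max))
print("first tick lies before the first data point:", ticks[0] < data_min)
```

So if the first and last tick should land exactly on the earliest and latest timestamp, the FixedLocator variant is the better fit; if evenly filling the visible axis is enough, LinearLocator is shorter.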

The sample data were stored in a file with the following structure:

TS   Val
0   2016-03-13 12:20:00  1
1   2016-04-13 13:39:00  3
2   2016-04-03 13:43:00  5
3   2016-06-17 16:00:00  1
4   2016-09-14 13:27:00  2
2088    2017-02-08 16:00:00  7
2089    2017-02-25 16:00:00  2
2090    2018-02-20 16:00:00  8
2091    2019-02-21 16:00:00  9
2092    2020-02-22 16:00:00  8

Answer 2:

Have you considered grouping by date if you don't need that many xticks? To answer your question, you can make custom ticks with:

plt.xticks(ticks=[ any list ], labels=[ list of labels ])
See the documentation for matplotlib.pyplot.xticks.
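
Both suggestions can be combined in a few lines. The following is a rough sketch, not the asker's actual data: the frame traffic and its values are made up in the shape of PatientTraffic, and the Agg backend is assumed so it runs without a display. It sums the counts per calendar day, keeps the pandas bar plot, and then labels only the first, middle, and last bar.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data shaped like the question's PatientTraffic frame
traffic = pd.DataFrame({
    "TimeStamp": pd.to_datetime([
        "2016-03-13 12:20:00", "2016-03-13 13:39:00", "2016-03-13 16:00:00",
        "2016-03-14 13:27:00", "2016-03-15 09:00:00", "2016-03-16 10:30:00",
        "2016-03-17 16:00:00",
    ]),
    "count": [1, 1, 1, 1, 2, 3, 8],
})

# Group by calendar date: one bar per day instead of one per timestamp
daily = traffic.groupby(traffic["TimeStamp"].dt.date)["count"].sum()

ax = daily.plot.bar(figsize=(10, 3), title="Patient Traffic per Day")

# pandas draws the bars at positions 0..len-1, so ticks are chosen by position
positions = np.linspace(0, len(daily) - 1, 3).round().astype(int)
labels = [str(daily.index[i]) for i in positions]

plt.sca(ax)  # make sure plt.xticks acts on the axes pandas returned
plt.xticks(ticks=positions, labels=labels, rotation=0)
```

Keep in mind that this stays a categorical plot: days without any record do not get a bar, so the spacing still does not reflect elapsed time.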

Source: https://stackoverflow.com/questions/64952301/x-axis-ticks-as-dates

Tags

python

matplotlib