Question
I have two datasets that contain temperature and light sensor readings. The measurements were taken from 22:35:41 to 04:49:41.
The problem with these datasets is plotting the measurements against time, stored in datetime.time format, when the measurements run from one day into the next (22:35:41 - 04:49:41). The plot function automatically starts at 00:00 and puts the data that was measured before 00:00 at the end of the plot.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Temperature = pd.read_excel("/kaggle/input/Temperature_measurement.xlsx")
Light = pd.read_excel("/kaggle/input/Light_measurement.xlsx")
sns.lineplot(x="Time",y="Light", data = Light)
sns.lineplot(y="Temperature", x="Time", data = Temperature)
plt.show()
This is a link to the dataset
Here is a link to the Jupyter Notebook
Answer 1:
First you need to convert your times to Pandas Timestamps. A Pandas Timestamp can't really hold a time on its own, so a date will be attached to each one, but that's fine since we'll hide that part later.
We also need to detect day changes. We can do that by finding where the time wraps around, which is wherever a time is smaller than its predecessor.
We can then count the cumulative wraps and add that number of days to our timestamps.
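To make the wrap detection concrete, here is a minimal sketch that walks through the steps one at a time. The four sample times are made up for illustration and are not taken from the question's data; the variable names are likewise just placeholders.

```python
import pandas as pd

# Hypothetical sample readings that cross midnight (not the asker's data).
times = pd.Series(["22:35:41", "23:50:00", "00:10:00", "04:49:41"])

# Parse the clock times; the resulting Timestamps all carry the date 1900-01-01.
stamps = pd.to_datetime(times, format="%H:%M:%S")

# True wherever a time is earlier than the one before it, i.e. at a midnight wrap.
# The first row compares against NaT, which evaluates to False.
wrapped = stamps.lt(stamps.shift())

# Running count of wraps so far = number of days to add to each row.
day_offset = wrapped.cumsum()

# Shift every row forward by its day offset.
normalized = stamps + pd.to_timedelta(day_offset, unit="D")
print(normalized.dt.strftime("%Y-%m-%d %H:%M:%S").tolist())
```

The first two rows stay on 1900-01-01 and the last two move to 1900-01-02, so the series is increasing again.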
Let's define a function that takes the datetime.time objects, converts them to native Pandas Timestamps (on the arbitrary date 1900-01-01, which is what Pandas uses when the format contains no date) and adjusts the day according to the wraps (so our final times end up on 1900-01-02):
def normalize_time(series):
    series = pd.to_datetime(series, format="%H:%M:%S")
    series += pd.to_timedelta(series.lt(series.shift()).cumsum(), unit="D")
    return series
Let's now apply it to our DataFrames:
Light["Time"] = normalize_time(Light["Time"])
Temperature["Time"] = normalize_time(Temperature["Time"])
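Depending on the Pandas version, pd.to_datetime may refuse a column that holds actual datetime.time objects (which read_excel can produce) rather than strings. If that happens, one possible workaround is to convert the values to strings first. The sketch below assumes whole-second times, since a time with microseconds would not match the "%H:%M:%S" format. The function name normalize_time_from_objects and the sample values are hypothetical.

```python
import datetime as dt

import pandas as pd


def normalize_time_from_objects(series):
    # Hypothetical variant of normalize_time: stringify first so that
    # datetime.time values are parsed the same way as "HH:MM:SS" text.
    parsed = pd.to_datetime(series.astype(str), format="%H:%M:%S")
    # Add one day for every midnight wrap seen so far.
    return parsed + pd.to_timedelta(parsed.lt(parsed.shift()).cumsum(), unit="D")


# Made-up sample values, not the asker's data.
sample = pd.Series([dt.time(22, 35, 41), dt.time(23, 59, 0), dt.time(0, 1, 0)])
result = normalize_time_from_objects(sample)
print(result)
```

Columns that already contain strings pass through astype(str) unchanged, so the same function should work for both cases.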
A plot of the data will now look correct, with the times being continuous. However, the X tick labels will try to display the dates, which are not what we care about, so let's fix that part next.
We can use Matplotlib's set_major_formatter together with a DateFormatter to include times only:
import matplotlib.dates
ax = plt.subplot()
sns.lineplot(x="Time", y="Light", data=Light)
sns.lineplot(x="Time", y="Temperature", data=Temperature)
ax.xaxis.set_major_formatter(
    matplotlib.dates.DateFormatter("%H:%M")
)
plt.show()
This produces X ticks every hour, which seems to be a great fit for this data set.
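If the automatic hourly spacing does not suit a different data set, the tick positions can also be set explicitly. The following is a minimal sketch using Matplotlib's HourLocator alongside the same DateFormatter; the two-hour spacing and the sample readings are assumptions chosen for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Made-up readings spanning midnight on the placeholder dates.
times = pd.date_range("1900-01-01 22:30", "1900-01-02 05:00", freq="30min")
values = list(range(len(times)))

fig, ax = plt.subplots()
ax.plot(times, values)

# Place a major tick at every even hour and label it with the time only.
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 2)))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

fig.savefig("ticks_example.png")
```

Passing a different byhour sequence changes which hours get a tick, while the formatter keeps the date hidden.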
Source: https://stackoverflow.com/questions/60318410/plot-time-series-with-different-timestamps-and-datetime-time-format-that-goes-ov