Question:
My dataframe has an uneven time index.
How can I plot the data and have the index located on the x axis automatically? I searched here, and I know I can plot something like
e.plot()

but then the time index (x axis) is at even intervals, for example every 5 minutes. If I have 100 data points in the first 5 minutes and 6 data points in the second 5 minutes, how do I plot so that the data points are evenly spaced, with the right timestamps located on the x axis?
Here's a plot that is evenly spaced by count, but I don't know how to add the time index to it.
plot(e['Bid'].values)

Example of the data format, as requested:
Time,Bid
2014-03-05 21:56:05:924300,1.37275
2014-03-05 21:56:05:924351,1.37272
2014-03-05 21:56:06:421906,1.37275
2014-03-05 21:56:06:421950,1.37272
2014-03-05 21:56:06:920539,1.37275
2014-03-05 21:56:06:920580,1.37272
2014-03-05 21:56:09:071981,1.37275
2014-03-05 21:56:09:072019,1.37272
And here's the link: http://code.google.com/p/eu-ats/source/browse/trunk/data/new/eur-fix.csv
Here's the code I used to plot:
import numpy as np
import pandas as pd
import datetime as dt
e = pd.read_csv("data/ecb/eur.csv", dtype={'Time':object})
e.Time = pd.to_datetime(e.Time, format='%Y-%m-%d %H:%M:%S:%f')
e.plot()
f = e.copy()
f.index = f.Time
x = [str(s)[:-7] for s in f.index]
ff = f.set_index(pd.Series(x))
ff.index.name = 'Time'
ff.plot()
Update:
I added two new plots for comparison to clarify the issue. I have now tried brute force: converting the timestamp index back to strings and plotting the strings as the x axis. The format easily gets messed up, and it seems hard to customize the location of the x labels.


Answer 1:
Ok, it seems like what you're after is that you want to move around the x-tick locations so that there are an equal number of points between each tick. And you'd like to have the grid drawn on these appropriately-located ticks. Do I have that right?
If so:
import urllib.request  # Python 3 location of urlopen
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sbn  # not used directly below
content = urllib.request.urlopen('https://eu-ats.googlecode.com/svn/trunk/data/new/eur-fix.csv')
df = pd.read_csv(content, header=0)
df['Time'] = pd.to_datetime(df['Time'], format='%Y-%m-%d %H:%M:%S:%f')
# Timestamps of every 30th row: equal numbers of points between ticks
every30 = df.loc[df.index % 30 == 0, 'Time'].values
fig, ax = plt.subplots(1, 1, figsize=(9, 5))
df.plot(x='Time', y='Bid', ax=ax)
ax.set_xticks(every30)
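If the goal is strictly equal spacing per data point (rather than per unit of time), one alternative is to plot against the row number and then relabel the ticks with the matching timestamps. This is only a minimal sketch and not part of the answer above. It inlines the eight sample rows from the question so it runs without the CSV, and the step value is a made-up tick spacing.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# The sample rows from the question, inlined so the sketch is self-contained.
csv_text = """Time,Bid
2014-03-05 21:56:05:924300,1.37275
2014-03-05 21:56:05:924351,1.37272
2014-03-05 21:56:06:421906,1.37275
2014-03-05 21:56:06:421950,1.37272
2014-03-05 21:56:06:920539,1.37275
2014-03-05 21:56:06:920580,1.37272
2014-03-05 21:56:09:071981,1.37275
2014-03-05 21:56:09:072019,1.37272
"""
df = pd.read_csv(io.StringIO(csv_text))
df["Time"] = pd.to_datetime(df["Time"], format="%Y-%m-%d %H:%M:%S:%f")

step = 2  # hypothetical tick spacing in rows (the answer above uses 30)
positions = list(range(0, len(df), step))
labels = df["Time"].iloc[positions].dt.strftime("%H:%M:%S.%f").tolist()

fig, ax = plt.subplots(figsize=(9, 5))
# x is the row number, so every point is the same distance from its neighbours
ax.plot(range(len(df)), df["Bid"].to_numpy())
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45, ha="right")
ax.set_xlabel("Time")
ax.grid(True)
fig.tight_layout()
```

The trade-off is that the x axis is no longer a true time scale: distances between ticks say nothing about elapsed time, only about how many rows lie between them.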

Answer 2:
I have tried to reproduce your issue, but I can't seem to. Can you have a look at this example and see how your situation differs?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sbn
np.random.seed(0)
idx = pd.date_range('11:00', '21:30', freq='1min')
ser = pd.Series(data=np.random.randn(len(idx)), index=idx)
ser = ser.cumsum()
for i in range(20):
    for j in range(8):
        ser.iloc[10*i +j] = np.nan
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
ser.plot(ax=axes[0])
ser.dropna().plot(ax=axes[1])
gives the following two plots:

There are a couple of differences between the graphs. The one on the left doesn't connect the non-contiguous bits of data, and it lacks vertical gridlines. But both seem to respect the actual index of the data. Can you show an example of your e series? What is the exact format of its index? Is it a DatetimeIndex or is it just text?
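A quick way to answer that question yourself is to look at the type of the index. The sketch below is illustrative only: it builds two tiny series from three of the timestamps in the question, one indexed by raw strings and one indexed by parsed datetimes, and prints what pandas reports for each.

```python
import pandas as pd

# Illustrative data only: three timestamps in the question's format.
raw = [
    "2014-03-05 21:56:05:924300",
    "2014-03-05 21:56:05:924351",
    "2014-03-05 21:56:06:421906",
]

# Same values, two different index types.
text_ser = pd.Series([1.37275, 1.37272, 1.37275], index=raw)
time_ser = text_ser.copy()
time_ser.index = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S:%f")

for name, s in [("text", text_ser), ("time", time_ser)]:
    # Print the index class and whether it is a real DatetimeIndex
    print(name, type(s.index).__name__, isinstance(s.index, pd.DatetimeIndex))
```

Only the second series has a DatetimeIndex; the first has a generic index of strings, which is plotted by position rather than as time.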
Edit:
Playing with this, my guess is that your index is actually just text. If I continue from above with:
idx_str = [str(x) for x in idx]
newser = ser
newser.index = idx_str
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
newser.plot(ax=axes[0])
newser.dropna().plot(ax=axes[1])
then I get something like your problem:

More edit:
If this is in fact your issue (the index is a bunch of strings, not really a bunch of timestamps) then you can convert them and all will be well:
idx_fixed = pd.to_datetime(idx_str)
fixedser = newser
fixedser.index = idx_fixed
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
fixedser.plot(ax=axes[0])
fixedser.dropna().plot(ax=axes[1])
produces output identical to the first code sample above.
Editing again:
To see the uneven spacing of the data, you can do this:
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
fixedser.plot(ax=axes[0], marker='.', linewidth=0)
fixedser.dropna().plot(ax=axes[1], marker='.', linewidth=0)
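The uneven spacing can also be checked numerically by taking the difference between consecutive timestamps. This is a rough sketch under a few assumptions: it rebuilds a series with the same gap pattern as the example above, but uses a fixed date and simple increasing values instead of random ones so the result is reproducible.

```python
import numpy as np
import pandas as pd

# One point per minute, then blank out 8 of every 10 points in the first 200,
# mirroring the gap pattern of the example above (the date here is arbitrary).
idx = pd.date_range("2014-03-05 11:00", "2014-03-05 21:30", freq="1min")
ser = pd.Series(np.arange(len(idx), dtype=float), index=idx)
for i in range(20):
    ser.iloc[10 * i: 10 * i + 8] = np.nan

kept = ser.dropna()
# Time elapsed between each remaining point and the one before it
gaps = kept.index.to_series().diff().dropna()
print(gaps.value_counts())
```

Gaps that are all the same size mean the data is evenly spaced in time; a mix of sizes is the uneven spacing that the dot plots make visible.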

Answer 3:
Let me try this one from scratch. Does this solve your issue?
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sbn
import urllib.request  # Python 3 location of urlopen
content = urllib.request.urlopen('https://eu-ats.googlecode.com/svn/trunk/data/new/eur-fix.csv')
# Read the Time column straight into the index, then parse it as datetimes
df = pd.read_csv(content, header=0, index_col='Time')
df.index = pd.to_datetime(df.index, format='%Y-%m-%d %H:%M:%S:%f')
df.plot()

The thing is, you want to plot bid vs. time. If you've put the times into your index, then they become your x-axis for "free". If the time data is just another column, then you need to specify that you want to plot bid as the y-axis variable and time as the x-axis variable. So in your code above, even when you converted the time data to datetime type, you were never instructing pandas/matplotlib to use those datetimes as the x-axis.
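To make the two options concrete, here is a small side-by-side sketch. It is not taken from the original code: it inlines the eight sample rows from the question in place of the CSV download, and the subplot titles are just illustrative.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# The sample rows from the question, inlined so the sketch is self-contained.
csv_text = """Time,Bid
2014-03-05 21:56:05:924300,1.37275
2014-03-05 21:56:05:924351,1.37272
2014-03-05 21:56:06:421906,1.37275
2014-03-05 21:56:06:421950,1.37272
2014-03-05 21:56:06:920539,1.37275
2014-03-05 21:56:06:920580,1.37272
2014-03-05 21:56:09:071981,1.37275
2014-03-05 21:56:09:072019,1.37272
"""
df = pd.read_csv(io.StringIO(csv_text))
df["Time"] = pd.to_datetime(df["Time"], format="%Y-%m-%d %H:%M:%S:%f")

fig, (ax_col, ax_idx) = plt.subplots(1, 2, figsize=(10, 4))

# Option 1: Time stays an ordinary column, so name it explicitly as x.
df.plot(x="Time", y="Bid", ax=ax_col, title="Time passed as x")

# Option 2: Time becomes the index, so plot() uses it as x automatically.
by_time = df.set_index("Time")
by_time["Bid"].plot(ax=ax_idx, title="Time as index")

fig.tight_layout()
```

Both panels should show the same curve against real timestamps. Calling df.plot() with neither option is what leaves the x axis as plain row numbers.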