Question
I read a CSV file containing 150,000 lines into a pandas DataFrame. The DataFrame has a field, 'Date', with dates in yyyy-mm-dd format. I want to extract the month, day and year from it and copy them into the DataFrame's columns 'Month', 'Day' and 'Year' respectively. For a few hundred records the two methods below work fine, but for 150,000 records both take a ridiculously long time to execute. Is there a faster way to do this for 100,000+ records?
First method:
import pandas
df = pandas.read_csv(filename)
for i in range(len(df)):
    df.loc[i,'Day'] = int(df.loc[i,'Date'].split('-')[2])
Second method:
from datetime import datetime
import pandas
df = pandas.read_csv(filename)
for i in range(len(df)):
    df.loc[i,'Day'] = datetime.strptime(df.loc[i,'Date'], '%Y-%m-%d').day
Thank you.
Answer 1:
Starting with pandas 0.15.0, the new .dt accessor provides a syntactically nicer way to do this (see the end of this answer). On earlier versions, DatetimeIndex does the job:
In [36]: df = DataFrame(date_range('20000101',periods=150000,freq='H'),columns=['Date'])
In [37]: df.head(5)
Out[37]:
                 Date
0 2000-01-01 00:00:00
1 2000-01-01 01:00:00
2 2000-01-01 02:00:00
3 2000-01-01 03:00:00
4 2000-01-01 04:00:00
[5 rows x 1 columns]
In [38]: %timeit f(df)
10 loops, best of 3: 22 ms per loop
In [39]: def f(df):
   ....:     df = df.copy()
   ....:     df['Year'] = DatetimeIndex(df['Date']).year
   ....:     df['Month'] = DatetimeIndex(df['Date']).month
   ....:     df['Day'] = DatetimeIndex(df['Date']).day
   ....:     return df
   ....:
In [40]: f(df).head()
Out[40]:
                 Date  Year  Month  Day
0 2000-01-01 00:00:00  2000      1    1
1 2000-01-01 01:00:00  2000      1    1
2 2000-01-01 02:00:00  2000      1    1
3 2000-01-01 03:00:00  2000      1    1
4 2000-01-01 04:00:00  2000      1    1
[5 rows x 4 columns]
From pandas 0.15.0 on (released in October 2014), the following is possible with the new .dt accessor:
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day
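Note that .dt only works on a datetime64 column, whereas a 'Date' column read from a CSV usually arrives as plain strings unless it is parsed. A minimal sketch of the full round trip, assuming the yyyy-mm-dd format from the question. The inline sample data and the helper name add_date_parts are made up for illustration and are not part of the original answer:

```python
import io

import pandas as pd


def add_date_parts(df, column="Date"):
    """Return a copy of df with integer Year, Month and Day columns."""
    out = df.copy()
    # Parse the whole column in one call instead of looping over rows.
    # The explicit format matches the yyyy-mm-dd layout from the question.
    parsed = pd.to_datetime(out[column], format="%Y-%m-%d")
    out["Year"] = parsed.dt.year
    out["Month"] = parsed.dt.month
    out["Day"] = parsed.dt.day
    return out


# Hypothetical stand-in for the CSV file from the question.
csv_text = "Date,Value\n2014-02-22,1\n2013-12-05,2\n"
df = add_date_parts(pd.read_csv(io.StringIO(csv_text)))
print(df)
```

The original 'Date' column is left as strings here; assign the parsed result back to it if a datetime column is wanted as well.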
Answer 2:
I use the code below, which works very well for me:
df['Year']=[d.split('-')[0] for d in df.Date]
df['Month']=[d.split('-')[1] for d in df.Date]
df['Day']=[d.split('-')[2] for d in df.Date]
df.head(5)
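The list comprehensions above split every value three times and leave the new columns as strings rather than integers. One possible variant, assuming every value is a well-formed yyyy-mm-dd string, splits each value once with the .str accessor and converts the pieces to integers. The sample data is made up for illustration:

```python
import pandas as pd

# Hypothetical sample standing in for the 'Date' column from the question.
df = pd.DataFrame({"Date": ["2014-02-22", "2013-12-05"]})

# Split each string once into three columns, then convert them to integers.
parts = df["Date"].str.split("-", expand=True).astype(int)
parts.columns = ["Year", "Month", "Day"]

# Attach the new columns; rows line up because the index is shared.
df = df.join(parts)
print(df)
```

Unlike the datetime-based approach, this does not validate the dates: a value such as 2014-13-45 would be split without complaint.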
Source: https://stackoverflow.com/questions/21954197/which-is-the-fastest-way-to-extract-day-month-and-year-from-a-given-date