I've got a couple of strings from which I want to get the datetime. They are formatted like this:
Thu 2nd May 2013 19:00
I know almost how I
Consider using dateutil.parser.parse.
It's a third party library that has a powerful parser which can handle these kinds of things.
from dateutil.parser import parse
s = 'Thu 2nd May 2013 19:00'
d = parse(s)
print(d, type(d))
# 2013-05-02 19:00:00 <class 'datetime.datetime'>
A brief caveat (it doesn't really apply in your case): if dateutil can't find part of the date in the string (say you leave out the month), it takes the missing part from the default argument of parse. If you don't pass one, that is the current date with the time 00:00:00. You can override this by passing a different datetime object as default.
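As a rough illustration of that caveat, the sketch below passes an explicit default so the result does not depend on today's date. The fallback value datetime(2000, 1, 1) and the shortened input string are assumptions chosen for the example, not something from the question.

```python
from datetime import datetime

from dateutil.parser import parse

# Assumed fallback: any field missing from the string is taken from here.
fallback = datetime(2000, 1, 1)

# The day is missing, so it comes from the fallback (day 1).
partial = parse("May 2013 19:00", default=fallback)
print(partial)  # 2013-05-01 19:00:00

# Nothing is missing here, so the fallback has no effect.
full = parse("Thu 2nd May 2013 19:00", default=fallback)
print(full)  # 2013-05-02 19:00:00
```

Without the default argument, the first call would pick up the day from the current date instead, which is easy to miss when testing.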
The easiest way to install dateutil is probably using pip with the command pip install python-dateutil.
You can pre-parse the original string to adjust the day so that it is suitable for strptime, e.g.:
from datetime import datetime
import re
s = 'Thu 2nd May 2013 19:00'
amended = re.sub(r'\d+(st|nd|rd|th)', lambda m: m.group()[:-2].zfill(2), s)
# Thu 02 May 2013 19:00
dt = datetime.strptime(amended, '%a %d %B %Y %H:%M')
# 2013-05-02 19:00:00
It's straightforward to remove the suffix from the date without using regular expressions or an external library.
from datetime import datetime
def remove_date_suffix(s):
    parts = s.split()
    # strip() removes any of these characters from both ends, which covers 'st', 'nd', 'rd' and 'th'
    parts[1] = parts[1].strip("stndrh")
    return " ".join(parts)
Then it's as simple as using strptime as you'd expect:
>>> s = "Thu 2nd May 2013 19:00"
>>> remove_date_suffix(s)
'Thu 2 May 2013 19:00'
>>> datetime.strptime(remove_date_suffix(s), '%a %d %B %Y %H:%M')
datetime.datetime(2013, 5, 2, 19, 0)
import re
from datetime import datetime
def proc_date(x):
    return re.sub(r"\b([0123]?[0-9])(st|th|nd|rd)\b", r"\1", x)
>>> x='Thu 2nd May 2013 19:00'
>>> proc_date(x)
'Thu 2 May 2013 19:00'
>>> datetime.strptime(proc_date(x), '%a %d %B %Y %H:%M')
datetime.datetime(2013, 5, 2, 19, 0)
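Since the question mentions a couple of strings, here is a minimal sketch that wraps the suffix-stripping idea in one function and applies it to a list. The name parse_ordinal_date and every sample string except the first are made up for illustration. It also assumes an English locale, because %a and %B are locale-dependent.

```python
import re
from datetime import datetime

# Matches a 1-2 digit day followed by an ordinal suffix, e.g. "2nd" or "23rd".
_ORDINAL = re.compile(r"\b(\d{1,2})(st|nd|rd|th)\b")


def parse_ordinal_date(text, fmt="%a %d %B %Y %H:%M"):
    """Drop the ordinal suffix from the day, then hand the rest to strptime.

    Hypothetical helper; fmt defaults to the layout shown in the question.
    """
    cleaned = _ORDINAL.sub(r"\1", text)
    return datetime.strptime(cleaned, fmt)


# Only the first string is from the question; the others are invented samples.
samples = [
    "Thu 2nd May 2013 19:00",
    "Sat 1st June 2013 08:30",
    "Wed 23rd October 2013 07:05",
]
parsed = [parse_ordinal_date(s) for s in samples]
for dt in parsed:
    print(dt.isoformat())
```

A string that doesn't match the format still raises ValueError from strptime, so bad input is not silently accepted.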