Question
Dateutil is a great tool for parsing dates in string format. For example,
from dateutil.parser import parse
parse("Tue, 01 Oct 2013 14:26:00 -0300")
returns
datetime.datetime(2013, 10, 1, 14, 26, tzinfo=tzoffset(None, -10800))
However,
parse("Ter, 01 Out 2013 14:26:00 -0300") # in Portuguese
yields this error:
ValueError: unknown string format
Does anybody know how to make dateutil aware of the locale?
Answer 1:
As far as I can see, dateutil is not locale aware (yet!).
I can think of three alternative suggestions:
1. The day and month names are hardcoded in dateutil.parser (as part of the parserinfo class). You could subclass parserinfo and replace these names with the appropriate names for Portuguese.

2. Modify dateutil to get day and month names based on the user's locale, so that you could do something like:

import locale
locale.setlocale(locale.LC_ALL, "pt_PT")
from dateutil.parser import parse
parse("Ter, 01 Out 2013 14:26:00 -0300")

I've started a fork which gets the names from the calendar module (which is locale-aware) to work on this: https://github.com/alexwlchan/dateutil

Right now it works for Portuguese (or seems to), but I want to think about it a bit more before I submit a patch to the main branch. In particular, weirdness may happen if it faces characters which aren't used in Western European languages. I haven't tested this yet. (See https://stackoverflow.com/a/8917539/1558022)

3. If you're not tied to the dateutil module, you could use datetime instead, which is already locale-aware:

from datetime import datetime
import locale
locale.setlocale(locale.LC_ALL, "pt_PT")
datetime.strptime("Ter, 01 Out 2013 14:26:00 -0300", "%a, %d %b %Y %H:%M:%S %z")

(Note that the %z token is not consistently supported in datetime.)
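The first suggestion (subclassing parserinfo) could look roughly like the sketch below. The class name PortugueseParserInfo and the exact spellings listed are my own assumptions for illustration; only the WEEKDAYS and MONTHS attributes and the parserinfo argument to parse() come from dateutil itself. Hyphenated forms such as "Segunda-feira" are not covered, since the parser looks names up one word at a time.

```python
from dateutil.parser import parse, parserinfo


class PortugueseParserInfo(parserinfo):
    """Hypothetical parserinfo subclass with Portuguese names."""

    # Each tuple holds the accepted spellings for one weekday.
    # The order follows dateutil's own convention: Monday first.
    WEEKDAYS = [
        ("Seg", "Segunda"),
        ("Ter", "Terça"),
        ("Qua", "Quarta"),
        ("Qui", "Quinta"),
        ("Sex", "Sexta"),
        ("Sáb", "Sab", "Sábado"),
        ("Dom", "Domingo"),
    ]

    # Each tuple holds the accepted spellings for one month, January first.
    MONTHS = [
        ("Jan", "Janeiro"),
        ("Fev", "Fevereiro"),
        ("Mar", "Março"),
        ("Abr", "Abril"),
        ("Mai", "Maio"),
        ("Jun", "Junho"),
        ("Jul", "Julho"),
        ("Ago", "Agosto"),
        ("Set", "Setembro"),
        ("Out", "Outubro"),
        ("Nov", "Novembro"),
        ("Dez", "Dezembro"),
    ]


# Pass an instance of the subclass so the parser uses the Portuguese names.
result = parse("Ter, 01 Out 2013 14:26:00 -0300",
               parserinfo=PortugueseParserInfo())
print(result)
```

This keeps the process locale untouched, at the cost of maintaining the name lists by hand. Because the subclass replaces the default lists, English names are no longer recognized when this parserinfo is used.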
Answer 2:
You could use PyICU to parse a localized date/time string in a given format:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from datetime import datetime
import icu # PyICU
df = icu.SimpleDateFormat(
'EEE, dd MMM yyyy HH:mm:ss zzz', icu.Locale('pt_BR'))
ts = df.parse(u'Ter, 01 Out 2013 14:26:00 -0300')
print(datetime.utcfromtimestamp(ts))
# -> 2013-10-01 17:26:00 (UTC)
It works on Python 2/3. It does not modify global state (locale).
If your actual input time string does not contain an explicit UTC offset, then you should specify the timezone for ICU to use explicitly; otherwise you can get a wrong result (ICU and datetime may use different timezone definitions).
If you only need to support Python 3 and you don't mind setting the locale, then you could use datetime.strptime(), as @alexwlchan suggested:
#!/usr/bin/env python3
import locale
from datetime import datetime
locale.setlocale(locale.LC_TIME, "pt_PT.UTF-8")
print(datetime.strptime("Ter, 01 Out 2013 14:26:00 -0300",
"%a, %d %b %Y %H:%M:%S %z")) # works on Python 3.2+
# -> 2013-10-01 14:26:00-03:00
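Since locale.setlocale() changes the locale for the whole process, one way to limit the side effect is to switch LC_TIME only for the duration of the parse and restore it afterwards. The sketch below assumes that approach; temporary_locale and parse_localized are hypothetical helper names, not part of the standard library, and the "pt_PT.UTF-8" locale has to be installed on the system for the demo to succeed.

```python
import locale
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def temporary_locale(category, name):
    """Hypothetical helper: set a locale category, then restore the old value."""
    saved = locale.setlocale(category)  # with no second argument this only queries
    try:
        locale.setlocale(category, name)
        yield
    finally:
        locale.setlocale(category, saved)


def parse_localized(text, fmt, locale_name):
    """Hypothetical helper: run strptime() while LC_TIME is set to locale_name."""
    with temporary_locale(locale.LC_TIME, locale_name):
        return datetime.strptime(text, fmt)


try:
    print(parse_localized("Ter, 01 Out 2013 14:26:00 -0300",
                          "%a, %d %b %Y %H:%M:%S %z",
                          "pt_PT.UTF-8"))
except locale.Error:
    # Raised by setlocale() when the requested locale is not installed.
    print("pt_PT.UTF-8 locale is not available on this system")
```

The locale is still global while the with block runs, so this is not safe if other threads format or parse dates at the same time.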
Answer 3:
from dateutil.parser import parse
parse("Ter, 01 Out 2013 14:26:00 -0300",fuzzy=True)
Result:
datetime.datetime(2013, 1, 28, 14, 26, tzinfo=tzoffset(None, -10800))
Source: https://stackoverflow.com/questions/19927654/using-dateutil-parser-to-parse-a-date-in-another-language