I have a string of variable length and I want to give strptime a format
that makes it ignore the rest of the string. For example, I have lines like
9/4/2013,00:00:00,7.8,7.4,9.53
10/4/2013,00:00:00,8.64,7.4,9.53
and I want a format that makes the call strptime(line, format)
work on those lines. Something like format='%d/%m/%Y,%H:%M:%S*'
(although I know that doesn't work). My question is similar to this one, but no answer there helped, and my problem is a little worse because the total length of my string can vary. I have a feeling that dateutil
could solve my problem, but I can't find anything there that does the trick.
I could probably do something like strptime(','.join(line.split(',')[:2]), format)
but I would rather not resort to that, for user-related reasons.
You cannot make datetime.strptime()
ignore part of the input; your only real option is to split off the extra text first.
So yes, you do have to split and rejoin your string:
from datetime import datetime
format = '%d/%m/%Y,%H:%M:%S'
# keep only the first two comma-separated fields (date and time)
datetime.strptime(','.join(line.split(',', 2)[:2]), format)
or find some other means to extract the information. You could use a regular expression, for example:
import re
from datetime import datetime
datetime_pattern = re.compile(r'(\d{1,2}/\d{1,2}/\d{4},\d{2}:\d{2}:\d{2})')
format = '%d/%m/%Y,%H:%M:%S'
datetime.strptime(datetime_pattern.search(line).group(), format)
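A slightly more defensive sketch of the two approaches above, in case some lines do not start with a timestamp. The function names parse_prefix_split and parse_prefix_regex are made up for illustration, and the code assumes the timestamp always sits at the very start of the line, so it uses match() rather than search():

```python
import re
from datetime import datetime

FORMAT = '%d/%m/%Y,%H:%M:%S'
# Day and month may have one or two digits; match() anchors at the line start.
DATETIME_PATTERN = re.compile(r'\d{1,2}/\d{1,2}/\d{4},\d{2}:\d{2}:\d{2}')


def parse_prefix_split(line):
    """Parse the first two comma-separated fields as a datetime."""
    # maxsplit=2 avoids splitting the (possibly long) tail any further.
    return datetime.strptime(','.join(line.split(',', 2)[:2]), FORMAT)


def parse_prefix_regex(line):
    """Return the datetime at the start of line, or None if there is none."""
    match = DATETIME_PATTERN.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(), FORMAT)


for line in ['9/4/2013,00:00:00,7.8,7.4,9.53',
             '10/4/2013,00:00:00,8.64,7.4,9.53']:
    print(parse_prefix_split(line), parse_prefix_regex(line))
```

Note that the regular expression only checks the shape of the text; strptime still validates the values, so a line starting with something like 99/99/2013 would raise ValueError in either function.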
Alternatively, you can avoid splitting the string and discarding the extra text by appending the extra text to the format string, where strptime matches it literally. Here t[t.index(',',t.index(',') + 1):]
is the extra text: everything from the second comma onward.
from datetime import datetime
l = ['9/4/2013,00:00:00,7.8,7.4,9.53', '10/4/2013,00:00:00,8.64,7.4,9.53']
for t in l:
    print(datetime.strptime(t, '%d/%m/%Y,%H:%M:%S' + t[t.index(',', t.index(',') + 1):]))
If the string contains '%', strptime would read it as the start of a format directive, so remove it first by replacing it with an empty string:
l = ['9/4/2013,00:00:00,7.8,7.4,9.53', '10/4/2013,00:00:00,8.64,7.4,9.53']
for t in l:
    t = t.replace('%', '')
    fmt = '%d/%m/%Y,%H:%M:%S' + t[t.index(',', t.index(',') + 1):]
    print(datetime.strptime(t, fmt))
Or, with string slicing and a static format string:
for t in l:
    print(datetime.strptime(t[:t.find(',', t.find(',') + 1)], '%d/%m/%Y,%H:%M:%S'))
2013-04-09 00:00:00
2013-04-10 00:00:00
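A possible variation on the "extra text in the format string" idea, assuming you would rather leave the input untouched than delete its '%' characters: escape each '%' in the tail as '%%', which strptime treats as a literal percent sign. The function name parse_keeping_tail and the sample line with a '%' in it are made up for illustration.

```python
from datetime import datetime

FORMAT = '%d/%m/%Y,%H:%M:%S'


def parse_keeping_tail(t):
    """Parse t by appending its trailing text, escaped, to the format."""
    # Index of the second comma, where the extra text begins.
    second_comma = t.index(',', t.index(',') + 1)
    # '%%' in a strptime format matches a literal '%' in the input.
    tail = t[second_comma:].replace('%', '%%')
    return datetime.strptime(t, FORMAT + tail)


print(parse_keeping_tail('9/4/2013,00:00:00,7.8%,7.4,9.53'))
```

As with the original version, t.index raises ValueError when the line has fewer than two commas, so this only fits input that always carries trailing fields.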
Have a look at datetime-glob, a module we developed to parse date/times from a list of files. You can use datetime_glob.PatternSegment
to parse arbitrary strings:
>>> import datetime_glob
>>> patseg = datetime_glob.parse_pattern_segment('%-d/%-m/%Y,%H:%M:%S*')
>>> match = datetime_glob.match_segment('9/4/2013,01:02:03,7.8,7.4,9.53', patseg)
>>> match.as_datetime()
datetime.datetime(2013, 4, 9, 1, 2, 3)
This also uses a regular expression, because Python's datetime
cannot ignore characters in the input. This version uses a non-capturing group (sorry, the example is not related to your question):
import datetime, re
date_re = re.compile(r'([^.]+)(?:\.[0-9]+) (\+[0-9]+)')
date_str = "2018-09-06 04:15:18.334232115 +0000"
date_str = " ".join(date_re.search(date_str).groups())
date_obj = datetime.datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S %z")
It's much better to use a regular expression, as @marjin suggests, so your code is more comprehensible and easier to update.
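If the fractional seconds in the example above matter, one option is to keep them instead of dropping them. The sketch below is a variation, not the answer's own code: it relies on the fact that strptime's %f directive accepts at most six digits, so it captures the first six and lets the regular expression swallow the rest. The variable names stamp, fraction and offset are made up for illustration.

```python
import datetime
import re

# Capture the date/time, up to six fractional digits, and the UTC offset;
# any further fractional digits are matched but not captured.
date_re = re.compile(r'([^.]+)\.([0-9]{1,6})[0-9]* ([+-][0-9]{4})')

date_str = "2018-09-06 04:15:18.334232115 +0000"
stamp, fraction, offset = date_re.search(date_str).groups()
date_obj = datetime.datetime.strptime(
    "{}.{} {}".format(stamp, fraction, offset), "%Y-%m-%d %H:%M:%S.%f %z")
print(date_obj)
```

This truncates the nanoseconds to microseconds rather than rounding them.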
Source: https://stackoverflow.com/questions/29284850/datetime-strptime-set-format-to-ignore-trailing-part-of-string