datetime strptime - set format to ignore trailing part of string

倖福魔咒の submitted on 2019-11-28 07:36:49

Question


I have a string with variable length and I want to give a format to strptime in order for the rest of the string to be ignored. Let me exemplify. I have something like

9/4/2013,00:00:00,7.8,7.4,9.53
10/4/2013,00:00:00,8.64,7.4,9.53

and I want a format that makes the command strptime(line,format) work to read those lines. Something like format='%d/%m/%Y,%H:%M:%S*', although I know that doesn't work. I guess my question is kind of similar to this one, but no answer there could help me and my problem is a little worse because the full length of my string can vary. I have a feeling that dateutil could solve my problem, but I can't find something there that does the trick.

I can probably do something like strptime(','.join(line.split(',')[:2]),format), but I wouldn't want to resort to that for user-related issues.


Answer 1:


You cannot have datetime.strptime() ignore part of the input; your only real option is to split off the extra text first.

So yes, you do have to split and rejoin your string:

from datetime import datetime

format = '%d/%m/%Y,%H:%M:%S'
datetime.strptime(','.join(line.split(',', 2)[:2]), format)

or find some other means to extract the information. You could use a regular expression, for example:

import re
from datetime import datetime

datetime_pattern = re.compile(r'(\d{1,2}/\d{1,2}/\d{4},\d{2}:\d{2}:\d{2})')
format = '%d/%m/%Y,%H:%M:%S'
datetime.strptime(datetime_pattern.search(line).group(), format)
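The regular-expression approach can be wrapped in a small helper. This is only a sketch: the names parse_leading_datetime, DATETIME_PREFIX and DATETIME_FORMAT are made up for illustration, and it assumes the timestamp always sits at the very start of the line, so it uses match() rather than search() and raises ValueError when no timestamp is found.

```python
import re
from datetime import datetime

# Hypothetical names; the pattern and format are the ones from the answer above.
DATETIME_PREFIX = re.compile(r'\d{1,2}/\d{1,2}/\d{4},\d{2}:\d{2}:\d{2}')
DATETIME_FORMAT = '%d/%m/%Y,%H:%M:%S'


def parse_leading_datetime(line):
    """Parse the timestamp at the start of `line`, ignoring whatever follows."""
    # match() anchors at the beginning of the string (assumption: the
    # timestamp is always the first thing on the line).
    match = DATETIME_PREFIX.match(line)
    if match is None:
        raise ValueError('no leading timestamp in %r' % line)
    return datetime.strptime(match.group(), DATETIME_FORMAT)


for line in ['9/4/2013,00:00:00,7.8,7.4,9.53',
             '10/4/2013,00:00:00,8.64,7.4,9.53']:
    print(parse_leading_datetime(line))
```

Failing loudly on a line without a timestamp is a design choice here; the original one-liner would instead raise AttributeError on the None returned by search().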



Answer 2:


To parse without splitting the string and discarding the extra text, append the extra text to the format string itself so that strptime matches it literally. Here t[t.index(',',t.index(',') + 1):] is the extra text, i.e. everything from the second comma onward.

from datetime import datetime
l = ['9/4/2013,00:00:00,7.8,7.4,9.53', '10/4/2013,00:00:00,8.64,7.4,9.53']
for t in l:
    print(datetime.strptime(t,'%d/%m/%Y,%H:%M:%S'+t[t.index(',',t.index(',')+1):]))

If the string contains '%', strptime would read it as the start of a format directive, so remove it first by replacing it with an empty string:

l = ['9/4/2013,00:00:00,7.8,7.4,9.53', '10/4/2013,00:00:00,8.64,7.4,9.53']
for t in l:
    t = t.replace('%','')
    fmt = '%d/%m/%Y,%H:%M:%S' + t[t.index(',',t.index(',')+1):]
    print(datetime.strptime(t, fmt))

Or with string slicing and a static format string:

for t in l:
    print(datetime.strptime(t[:t.find(',',t.find(',')+1)],'%d/%m/%Y,%H:%M:%S'))

2013-04-09 00:00:00
2013-04-10 00:00:00
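As a variation on the '%' handling, the extra text can be kept intact by doubling each '%' before appending it to the format, since strptime treats '%%' as a literal percent sign. This is a sketch only: the function name parse_with_escaped_tail and the sample value 50% are made up for illustration.

```python
from datetime import datetime

BASE_FORMAT = '%d/%m/%Y,%H:%M:%S'


def parse_with_escaped_tail(t):
    """Parse `t` by appending its trailing text to the format string."""
    # Position of the second comma: everything from here on is extra text.
    second_comma = t.index(',', t.index(',') + 1)
    # Double each '%' so strptime matches it as a literal percent sign
    # instead of reading it as the start of a directive.
    tail = t[second_comma:].replace('%', '%%')
    return datetime.strptime(t, BASE_FORMAT + tail)


# Hypothetical input line containing a percent sign in the extra text.
print(parse_with_escaped_tail('9/4/2013,00:00:00,50%,7.4,9.53'))
```

Unlike stripping '%' from the input, this leaves the original string untouched, which matters if the rest of the line is processed afterwards.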




Answer 3:


Have a look at datetime-glob, a module we developed to parse date/times from a list of files. You can use datetime_glob.PatternSegment to parse arbitrary strings:

>>> import datetime_glob
>>> patseg = datetime_glob.parse_pattern_segment('%-d/%-m/%Y,%H:%M:%S*')
>>> match = datetime_glob.match_segment('9/4/2013,01:02:03,7.8,7.4,9.53',
                                        patseg)
>>> match.as_datetime()
datetime.datetime(2013, 4, 9, 1, 2, 3)



Answer 4:


This also uses a regular expression, because Python's datetime does not allow ignoring characters. This version uses a non-capturing group (sorry, the example is not related to your question):

import datetime, re

date_re = re.compile(r'([^.]+)(?:\.[0-9]+) (\+[0-9]+)')
date_str = "2018-09-06 04:15:18.334232115 +0000"

date_str = " ".join(date_re.search(date_str).groups())

date_obj = datetime.datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S %z")

It's much better to use a regexp, as @marjin suggests, so that your code is more comprehensible and easier to update.



Source: https://stackoverflow.com/questions/29284850/datetime-strptime-set-format-to-ignore-trailing-part-of-string
