Parsing an ISO 8601 date in Java8 when all fields (including separators, but not including years) are optional

耗尽温柔 submitted on 2020-01-21 09:07:28

Question


I need to parse ISO 8601 formatted strings in Java with various levels of precision. Some examples of the strings I need to parse are:

  • 2018
  • 2018-10
  • 2018-10-15
  • 2018-10-15T12:00
  • 2018-10-15T12:00:30
  • 2018-10-15T12:00:30.123
  • 20181015
  • 201810151200
  • 20181015120030
  • 20181015120030.123
  • 20181015T12:00:30.123

Where a field is missing, I am free to assume the lowest valid value that applies (for example, if the month is missing I can assume January, if the day is missing I can assume the first of the month, and if the time is missing I can assume midnight).

I've searched SO, and all the examples I've found assume that I know the exact format in advance.


Answer 1:


Well, that took me longer than I had expected. The only parser I found that works is:

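import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Wrapper class added so the snippet compiles; the name is arbitrary.
public class Iso8601ParseDemo {
  public static void main(String[] args) {
    DateTimeFormatter dtf = new DateTimeFormatterBuilder()
        // Exactly four digits for the year, so "20181015" is not read as one number.
        .appendValue(ChronoField.YEAR, 4)
        // Every later field, and every separator, sits in a nested optional section.
        .appendPattern("[['-']MM[['-']dd[['T']HH[[':']mm[[':']ss['.'SSS]]]]]]")
        // Missing fields fall back to their lowest valid value.
        .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
        .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
        .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
        .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
        .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
        .parseDefaulting(ChronoField.NANO_OF_SECOND, 0)
        .toFormatter();

    String[] s = {
        "2018",
        "2018-10",
        "2018-10-15",
        "2018-10-15T12:00",
        "2018-10-15T12:00:30",
        "2018-10-15T12:00:30.123",
        "20181015",
        "201810151200",
        "20181015120030",
        "20181015120030.123",
        "20181015T12:00:30.123"
    };
    for (String line : s) {
      System.out.println(LocalDateTime.parse(line, dtf));
    }
  }
}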

The problem is that yyyy creates a ValueParser(minWidth=4, maxWidth=19, SignStyle.EXCEEDS_PAD), which, for example, reads the date 20181015 as year=20181015. So we have to restrict the digit width of the year to 4.

The documentation states:

Year: The count of letters determines the minimum field width below which padding is used.

But it does not specify a maximum width.
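
A minimal sketch of the width issue described above, assuming a reduced pattern with only month and day for brevity. The class name YearWidthDemo, the method describe and the use of uuuu rather than yyyy are choices made for this illustration, not part of the original answer. It runs the same inputs through a pattern-based year and through a year fixed at four digits, and prints what each formatter does with them:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

// Hypothetical demo class, not part of the original answer.
class YearWidthDemo {

    // Year taken from pattern letters: minimum width 4, no fixed maximum.
    static final DateTimeFormatter VARIABLE_WIDTH_YEAR = new DateTimeFormatterBuilder()
            .appendPattern("uuuu[MM[dd]]")
            .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            .toFormatter();

    // Year restricted to exactly four digits.
    static final DateTimeFormatter FIXED_WIDTH_YEAR = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.YEAR, 4)
            .appendPattern("[MM[dd]]")
            .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            .toFormatter();

    // Returns the parsed date as text, or a short description of the failure.
    static String describe(DateTimeFormatter formatter, String text) {
        try {
            return LocalDate.parse(text, formatter).toString();
        } catch (DateTimeParseException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        for (String text : new String[] {"2018", "201810", "20181015"}) {
            System.out.println(text + " variable-width year: " + describe(VARIABLE_WIDTH_YEAR, text));
            System.out.println(text + " fixed-width year:    " + describe(FIXED_WIDTH_YEAR, text));
        }
    }
}
```

Only the fixed-width variant is expected to split 20181015 into year, month and day.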




Answer 2:


For the first cases with separators (-, :) one can use:

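    // text holds the input string, e.g. "2018-10-15T12:00"
    DateTimeFormatter dtf = DateTimeFormatter
        .ofPattern("uuuu[-MM[-dd[['T']HH[:]mm[[:]ss[[.]SSS]]]]]");
    // With a ParsePosition, parsing may stop before the end of the text;
    // pos.getIndex() reports how far it got.
    ParsePosition pos = new ParsePosition(0);
    TemporalAccessor result = dtf.parse(text, pos);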

However, neither uuuuMMdd nor an optional separator such as [-] or ['-'] worked for me in Java 8.
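
Because this formatter sets no defaults, the returned TemporalAccessor holds only the fields that were present in the input. One possible way to turn it into a LocalDateTime is to query each field and fall back to its lowest valid value, as sketched below. The class PartialIsoParser and the helper fieldOr are hypothetical names, and the sketch is meant only for the separated forms, in line with the limitation just mentioned.

```java
import java.text.ParsePosition;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;

// Hypothetical helper class, not part of the original answer.
class PartialIsoParser {

    static final DateTimeFormatter DTF = DateTimeFormatter
            .ofPattern("uuuu[-MM[-dd[['T']HH[:]mm[[:]ss[[.]SSS]]]]]");

    // Returns the parsed value of the field if present, otherwise the fallback.
    static int fieldOr(TemporalAccessor parsed, ChronoField field, int fallback) {
        return parsed.isSupported(field) ? Math.toIntExact(parsed.getLong(field)) : fallback;
    }

    static LocalDateTime parse(String text) {
        ParsePosition pos = new ParsePosition(0);
        TemporalAccessor parsed = DTF.parse(text, pos);
        // Parsing with a ParsePosition may stop early, so reject leftover text.
        if (pos.getIndex() < text.length()) {
            throw new IllegalArgumentException(
                    "Unparsed text at index " + pos.getIndex() + ": " + text);
        }
        return LocalDateTime.of(
                fieldOr(parsed, ChronoField.YEAR, 1),
                fieldOr(parsed, ChronoField.MONTH_OF_YEAR, 1),
                fieldOr(parsed, ChronoField.DAY_OF_MONTH, 1),
                fieldOr(parsed, ChronoField.HOUR_OF_DAY, 0),
                fieldOr(parsed, ChronoField.MINUTE_OF_HOUR, 0),
                fieldOr(parsed, ChronoField.SECOND_OF_MINUTE, 0),
                fieldOr(parsed, ChronoField.NANO_OF_SECOND, 0));
    }

    public static void main(String[] args) {
        for (String text : new String[] {"2018", "2018-10", "2018-10-15T12:00:30.123"}) {
            System.out.println(parse(text));
        }
    }
}
```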




Answer 3:


Create a lookup table of formatters (DateTimeFormatter or whatever you're using), keyed on the length of the input string and the occurrence of 'T'.
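
A rough sketch of what such a lookup table might look like for the eleven example inputs in the question. The class name LengthKeyedParser, the key scheme (length plus a "T" suffix when the input contains 'T') and the choice of parser per key are assumptions made for illustration; the answer does not prescribe them.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.Year;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical class; keys and patterns cover only the question's examples.
class LengthKeyedParser {

    private static final Map<String, Function<String, LocalDateTime>> PARSERS = new HashMap<>();

    // Wraps a fixed pattern as a parser function.
    private static Function<String, LocalDateTime> withPattern(String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        return text -> LocalDateTime.parse(text, formatter);
    }

    static {
        // Separated forms: the standard ISO parsers already handle these.
        PARSERS.put("4", text -> Year.parse(text).atDay(1).atStartOfDay());
        PARSERS.put("7", text -> YearMonth.parse(text).atDay(1).atStartOfDay());
        PARSERS.put("10", text -> LocalDate.parse(text).atStartOfDay());
        PARSERS.put("16T", LocalDateTime::parse);
        PARSERS.put("19T", LocalDateTime::parse);
        PARSERS.put("23T", LocalDateTime::parse);
        // Compact forms: one explicit formatter per length.
        PARSERS.put("8", text -> LocalDate.parse(text, DateTimeFormatter.BASIC_ISO_DATE).atStartOfDay());
        PARSERS.put("12", withPattern("uuuuMMddHHmm"));
        PARSERS.put("14", withPattern("uuuuMMddHHmmss"));
        PARSERS.put("18", withPattern("uuuuMMddHHmmss.SSS"));
        PARSERS.put("21T", withPattern("uuuuMMdd'T'HH:mm:ss.SSS"));
    }

    static LocalDateTime parse(String text) {
        // Key: input length, plus "T" if a date/time separator is present.
        String key = text.length() + (text.indexOf('T') >= 0 ? "T" : "");
        Function<String, LocalDateTime> parser = PARSERS.get(key);
        if (parser == null) {
            throw new IllegalArgumentException("Unsupported format: " + text);
        }
        return parser.apply(text);
    }

    public static void main(String[] args) {
        System.out.println(parse("2018"));
        System.out.println(parse("201810151200"));
        System.out.println(parse("20181015T12:00:30.123"));
    }
}
```

The trade-off is that every supported shape has to be listed explicitly, whereas a single formatter with optional sections accepts them all.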




Answer 4:


You can create a DateTimeFormatter with DateTimeFormatterBuilder, which has a method called parseDefaulting(). parseDefaulting() supplies a default value for a field when the input contains nothing that matches it.
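
As a small illustration of parseDefaulting() in isolation, the sketch below makes only the day optional and defaults it to 1. The class name and the pattern are assumptions chosen to keep the example short.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical demo class, not part of the original answer.
class ParseDefaultingDemo {

    // The day is optional in the text; when absent, it defaults to 1.
    static final DateTimeFormatter YEAR_MONTH_OR_DATE = new DateTimeFormatterBuilder()
            .appendPattern("uuuu-MM[-dd]")
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            .toFormatter();

    public static void main(String[] args) {
        System.out.println(LocalDate.parse("2018-10", YEAR_MONTH_OR_DATE));
        System.out.println(LocalDate.parse("2018-10-15", YEAR_MONTH_OR_DATE));
    }
}
```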



Source: https://stackoverflow.com/questions/52815456/parsing-an-iso-8601-date-in-java8-when-all-fields-including-separators-but-not
