How to Parse Date Strings with 🎌 Japanese Numbers in Java DateTime API

笑着哭i submitted on 2021-02-18 22:09:13

Question


After asking [How to parse 🎌 Japanese Era Date string values into LocalDate & LocalDateTime],
I was curious about the following case:

明治二十三年十一月二十九日

Is there a way to parse Japanese numerals in addition to the Japanese calendar characters (in other words, a purely Japanese date) into a LocalDate, using only the Java DateTime API? I don't want to modify the input String values; I want the API itself to handle the recognition.


Answer 1:


For anyone reading along, your example date string holds an era designator (明治, Meiji), a year of era of 23 (in this case corresponding to 1890 CE in the Gregorian calendar), month 11 and day of month 29. Months and days are the same as in the Gregorian calendar.
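The era arithmetic can be checked without any parsing. The following is only a small sketch using the standard `JapaneseDate` and `JapaneseEra` classes; the class name `MeijiCheck` and the helper `toIso` are made up for illustration.

```java
import java.time.LocalDate;
import java.time.chrono.JapaneseDate;
import java.time.chrono.JapaneseEra;

final class MeijiCheck {

    // Hypothetical helper: builds a date in the Japanese calendar and
    // converts it to the ISO (proleptic Gregorian) calendar.
    static LocalDate toIso(JapaneseEra era, int yearOfEra, int month, int dayOfMonth) {
        return LocalDate.from(JapaneseDate.of(era, yearOfEra, month, dayOfMonth));
    }

    public static void main(String[] args) {
        // Meiji 23, month 11, day 29
        System.out.println(toIso(JapaneseEra.MEIJI, 23, 11, 29)); // 1890-11-29
    }
}
```

Only the year numbering differs between the two calendars here; month and day pass through unchanged.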

Since Japanese numerals, unlike Arabic numerals, are not entirely positional, a DateTimeFormatter doesn’t parse them on its own. So we help it by supplying how the numbers look in Japanese (and Chinese). DateTimeFormatterBuilder has an overloaded appendText method that accepts a map from each numeric value to its text. My code example is not complete, but it should get you started.

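    import java.time.LocalDate;
    import java.time.chrono.JapaneseChronology;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeFormatterBuilder;
    import java.time.temporal.ChronoField;
    import java.util.Locale;
    import java.util.Map;

    Locale japaneseJapan = Locale.forLanguageTag("ja-JP");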

    Map<Long, String> numbers = Map.ofEntries(
            Map.entry(1L, "\u4e00"),
            Map.entry(2L, "\u4e8c"),
            Map.entry(3L, "\u4e09"),
            Map.entry(4L, "\u56db"),
            Map.entry(5L, "\u4e94"),
            Map.entry(6L, "\u516d"),
            Map.entry(7L, "\u4e03"),
            Map.entry(8L, "\u516b"),
            Map.entry(9L, "\u4e5d"),
            Map.entry(10L, "\u5341"),
            Map.entry(11L, "\u5341\u4e00"),
            Map.entry(12L, "\u5341\u4e8c"),
            Map.entry(13L, "\u5341\u4e09"),
            Map.entry(14L, "\u5341\u56db"),
            Map.entry(15L, "\u5341\u4e94"),
            Map.entry(16L, "\u5341\u516d"),
            Map.entry(17L, "\u5341\u4e03"),
            Map.entry(18L, "\u5341\u516b"),
            Map.entry(19L, "\u5341\u4e5d"),
            Map.entry(20L, "\u4e8c\u5341"),
            Map.entry(21L, "\u4e8c\u5341\u4e00"),
            Map.entry(22L, "\u4e8c\u5341\u4e8c"),
            Map.entry(23L, "\u4e8c\u5341\u4e09"),
            Map.entry(24L, "\u4e8c\u5341\u56db"),
            Map.entry(25L, "\u4e8c\u5341\u4e94"),
            Map.entry(26L, "\u4e8c\u5341\u516d"),
            Map.entry(27L, "\u4e8c\u5341\u4e03"),
            Map.entry(28L, "\u4e8c\u5341\u516b"),
            Map.entry(29L, "\u4e8c\u5341\u4e5d"),
            Map.entry(30L, "\u4e09\u5341"));

    DateTimeFormatter japaneseformatter = new DateTimeFormatterBuilder()
            .appendPattern("GGGG")
            .appendText(ChronoField.YEAR_OF_ERA, numbers)
            .appendLiteral('\u5e74')
            .appendText(ChronoField.MONTH_OF_YEAR, numbers)
            .appendLiteral('\u6708')
            .appendText(ChronoField.DAY_OF_MONTH, numbers)
            .appendLiteral('\u65e5')
            .toFormatter(japaneseJapan)
            .withChronology(JapaneseChronology.INSTANCE);

    String dateString = "明治二十三年十一月二十九日";
    System.out.println(dateString + " is parsed into " + LocalDate.parse(dateString, japaneseformatter));

The output from this example is:

明治二十三年十一月二十九日 is parsed into 1890-11-29

The map above stops at 30, but an era can last longer than 30 years (and a month can have 31 days), so you need to supply more numbers to the map. You can probably do that a lot better than I can (and can also check my numbers for bugs). It’s probably best (less error-prone) to fill the map with a couple of nested loops, but I wasn’t sure I could do it correctly, so I am leaving that part to you.
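One way the nested loops could look is sketched below. This is not the answerer's code: the class name `KanjiNumerals` and the method `upTo99` are hypothetical, and the sketch assumes the common convention that 10 is written 十 (not 一十) and that numbers above 99 are not needed for years of era, months or days.

```java
import java.util.HashMap;
import java.util.Map;

final class KanjiNumerals {

    // Index 0 is empty so that a zero ones-digit contributes no character.
    private static final String[] DIGITS = {
        "", "\u4e00", "\u4e8c", "\u4e09", "\u56db", "\u4e94",
        "\u516d", "\u4e03", "\u516b", "\u4e5d"
    };
    private static final String TEN = "\u5341"; // 十

    // Hypothetical helper: builds the value-to-text map for 1..99.
    static Map<Long, String> upTo99() {
        Map<Long, String> numbers = new HashMap<>();
        for (int tens = 0; tens <= 9; tens++) {
            for (int ones = 0; ones <= 9; ones++) {
                int value = tens * 10 + ones;
                if (value == 0) {
                    continue; // there is no year, month or day zero
                }
                StringBuilder text = new StringBuilder();
                if (tens >= 2) {
                    text.append(DIGITS[tens]); // multiplier, omitted for 10..19
                }
                if (tens >= 1) {
                    text.append(TEN);
                }
                text.append(DIGITS[ones]);
                numbers.put((long) value, text.toString());
            }
        }
        return numbers;
    }
}
```

The returned map can be passed to `appendText` in place of the hand-written `Map.ofEntries(...)` map.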

Today I learned something about Japanese numerals.

Some links I used

  • Japanese numerals
  • Unicode characters for Chinese and Japanese numbers



Answer 2:


This is a late answer, but the accepted answer is somewhat lengthy and not so easy to complete, so I think my proposal is a good and powerful alternative.

Use my lib Time4J which supports Japanese numerals out of the box and then use the embedded Japanese calendar:

    String input = "明治二十三年十一月二十九日";
    ChronoFormatter<JapaneseCalendar> f =
        ChronoFormatter.ofPattern(
            "GGGGy年M月d日",
            PatternType.CLDR,
            Locale.JAPANESE,
            JapaneseCalendar.axis()
        ).with(Attributes.NUMBER_SYSTEM, NumberSystem.JAPANESE);
    JapaneseCalendar jcal = f.parse(input);
    LocalDate gregorian = jcal.transform(PlainDate.axis()).toTemporalAccessor();
    System.out.println(gregorian); // 1890-11-29

This solution is not just shorter; it also works for historic Japanese dates before Meiji 6 (which were based on the old lunisolar calendar in use at that time). Furthermore, the gannen notation for the first year of an era (at the time of writing we are in such a year) is much better supported than in standard Java (where you again have to apply a lengthy workaround using a customized map).
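For comparison, the customized-map workaround in standard Java might look roughly like the sketch below. It is an assumption about what such a workaround could be, not code from either answer: the class `GannenSketch` and its methods are hypothetical, and the year map deliberately replaces 一 with 元 for year 1, so "一年" would no longer parse as a year.

```java
import java.time.LocalDate;
import java.time.chrono.JapaneseChronology;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class GannenSketch {

    private static final String[] DIGITS = {
        "", "\u4e00", "\u4e8c", "\u4e09", "\u56db", "\u4e94",
        "\u516d", "\u4e03", "\u516b", "\u4e5d"
    };

    // Kanji text for 1..99, writing 10 as 十 rather than 一十.
    static String kanji(int value) {
        int tens = value / 10;
        int ones = value % 10;
        return (tens >= 2 ? DIGITS[tens] : "") + (tens >= 1 ? "\u5341" : "") + DIGITS[ones];
    }

    // With gannen == true, year 1 is written 元 instead of 一.
    static Map<Long, String> numbers(boolean gannen) {
        Map<Long, String> map = new HashMap<>();
        for (int i = 1; i <= 99; i++) {
            map.put((long) i, kanji(i));
        }
        if (gannen) {
            map.put(1L, "\u5143"); // 元
        }
        return map;
    }

    static DateTimeFormatter formatter() {
        return new DateTimeFormatterBuilder()
                .appendPattern("GGGG")
                .appendText(ChronoField.YEAR_OF_ERA, numbers(true))
                .appendLiteral('\u5e74')
                .appendText(ChronoField.MONTH_OF_YEAR, numbers(false))
                .appendLiteral('\u6708')
                .appendText(ChronoField.DAY_OF_MONTH, numbers(false))
                .appendLiteral('\u65e5')
                .toFormatter(Locale.forLanguageTag("ja-JP"))
                .withChronology(JapaneseChronology.INSTANCE);
    }

    public static void main(String[] args) {
        // 平成元年一月八日, the first day of the Heisei era
        String input = "\u5e73\u6210\u5143\u5e74\u4e00\u6708\u516b\u65e5";
        System.out.println(LocalDate.parse(input, formatter()));
    }
}
```

Parsing the era name relies on the Japanese locale data that ships with the JDK, as in the first answer.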



Source: https://stackoverflow.com/questions/57215918/how-to-parse-date-strings-with-japanese-numbers-in-java-datetime-api