How Convert.ToDateTime() parses a given string when the given culture does not know the format

落爺英雄遲暮 posted on 2019-12-13 14:31:56

Question


I have the following code, and it works.

string testDateStr = "2009.7.28 05:23:15";
DateTime testDateObj = Convert.ToDateTime(testDateStr, CultureInfo.GetCultureInfo("fr-FR"));

And I checked the valid formats for my culture:

string[] validFormats = testDateObj.GetDateTimeFormats(CultureInfo.GetCultureInfo("fr-FR"));

None of them matches the "2009.7.28 05:23:15" format. I want to know how this string is parsed without throwing a FormatException, and what kind of hidden parsing happens when we call Convert.ToDateTime().

Update: I tried the following after LakshmiNarayanan's answer.

foreach(var culture in CultureInfo.GetCultures(CultureTypes.AllCultures))
{
    foreach(var format in testDateObj.GetDateTimeFormats(culture))
    {
        if (format == testDateStr)
        {
            Console.WriteLine(culture.DisplayName);
        }
    }
}

There are cultures whose formats actually include the format my string is in, but that still fails to explain why no exception is thrown when we ask for a conversion using a specific culture that does not know the format the string is in.


Answer 1:


The Convert.ToDateTime method internally uses the DateTime.Parse method, which is built on a sophisticated internal Lex method. A set of rules is applied to the passed string: it is split into tokens, and each token is analyzed. The analysis is really complex, so I'm going to show only a couple of the rules.

If a token is composed of digits and has a length of 3 to 8, then this token is taken to be the year. That is why it is possible to parse the string 01.2014.01, which produces 01 Jan 2014. Note that you can also parse strings like 01 2014 01 or 01\n2014\n01, giving the same result. You can separate tokens using whitespace or the , and . symbols.
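The year rule above can be tried out with a minimal sketch such as the following. It uses the fr-FR culture from the question and TryParse, so that an input a particular runtime version rejects is reported rather than thrown. The sample list and the output format are illustrative choices, and the sketch assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.Globalization;

CultureInfo frFr = CultureInfo.GetCultureInfo("fr-FR");

// The string from the question plus the variants mentioned in this answer.
string[] digitSamples = { "2009.7.28 05:23:15", "01.2014.01", "01 2014 01", "01\n2014\n01" };

foreach (string sample in digitSamples)
{
    // TryParse reports failure through its return value instead of a FormatException.
    bool ok = DateTime.TryParse(sample, frFr, DateTimeStyles.None, out DateTime parsed);

    Console.WriteLine("{0,-22} -> {1}",
        sample.Replace("\n", "\\n"),
        ok ? parsed.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture) : "not parsed");
}
```

Each line of output shows which tokens the parser settled on as year, month and day for that input.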

If a token is the name of a month, then it is taken to be the month (the table of tokens is built within the internal DateTimeFormatInfo.CreateTokenHashTable method). So it doesn't matter where you place Feb or February. You can equally parse 2014 1 Jan, 2014.Jan.1, 2014...,Jan..,..1 or even 5Jan2014 (the last one doesn't use any separator, but the parser checks where the number ends, so the string is successfully split into the tokens 5, Jan and 2014).
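A small sketch for experimenting with the month-name rule, using only the public API. TryParseInvariant is a hypothetical helper introduced here for illustration, and "1 Jan 2014" is added as a conventional baseline; the other strings are the ones listed above. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the answer): returns null instead of throwing.
static DateTime? TryParseInvariant(string text)
{
    return DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.None, out DateTime value)
        ? value
        : (DateTime?)null;
}

// Baseline first, then the unusual placements of the month name from the text.
string[] monthSamples = { "1 Jan 2014", "2014 1 Jan", "2014.Jan.1", "2014...,Jan..,..1", "5Jan2014" };

foreach (string sample in monthSamples)
{
    DateTime? result = TryParseInvariant(sample);

    Console.WriteLine("{0,-20} -> {1}",
        sample,
        result.HasValue ? result.Value.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture) : "not parsed");
}
```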

If we have an ambiguous string such as 01/04, then information from the culture is used to resolve the order of day and month. The order is extracted from DateTimeFormatInfo.MonthDayPattern. For example, for en-US it is MMMM dd, and for en-GB it is dd MMMM. The internal System.DateTimeParse class has a private static bool GetMonthDayOrder(string pattern, DateTimeFormatInfo dtfi, out int order) method, which is used to extract the order. If the order variable takes the value 6, then the order is MM/dd; if it takes the value 7, then it is dd/MM. Note that the parser will not apply any heuristics to a string like 01/31; only the order extracted from the culture is considered. Here is the code for a test:

// Requires: using System; using System.Globalization; using System.Reflection;
CultureInfo ci = CultureInfo.GetCultureInfo("en-US");

DateTimeFormatInfo dtfi = ci.DateTimeFormat;
// System.DateTimeParse is internal, so it is reached through reflection; this relies on framework internals.
Assembly coreAssembly = typeof(DateTime).Assembly;
Type dateTimeParseType = coreAssembly.GetType("System.DateTimeParse");
MethodInfo getMonthDayOrderMethodInfo = dateTimeParseType.GetMethod("GetMonthDayOrder", BindingFlags.Static | BindingFlags.NonPublic);
// The third slot receives the value of the out parameter "order" after the call.
object[] parameters = new object[] { dtfi.MonthDayPattern, dtfi, null };
getMonthDayOrderMethodInfo.Invoke(null, parameters);
int order = (int)parameters[2];
switch (order)
{
    case -1:
        Console.WriteLine("Cannot extract information");
        break;
    case 6:
        Console.WriteLine("MM/dd");
        break;
    case 7:
        Console.WriteLine("dd/MM");
        break;
}
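The effect of the month/day order can also be observed through the public API, without reflection. The following is a minimal sketch, assuming the en-US and en-GB cultures are available on the machine and that top-level statements (C# 9 or later) can be used; the variable names are illustrative.

```csharp
using System;
using System.Globalization;

CultureInfo enUs = CultureInfo.GetCultureInfo("en-US");
CultureInfo enGb = CultureInfo.GetCultureInfo("en-GB");

// The same ambiguous month/day string is read differently by each culture.
DateTime us = DateTime.Parse("01/04", enUs);
DateTime gb = DateTime.Parse("01/04", enGb);

Console.WriteLine("en-US: month={0}, day={1}", us.Month, us.Day); // month=1, day=4
Console.WriteLine("en-GB: month={0}, day={1}", gb.Month, gb.Day); // month=4, day=1

// No heuristic is applied: en-GB reads 01/31 as day 1 of month 31 and rejects it.
bool gbAccepts = DateTime.TryParse("01/31", enGb, DateTimeStyles.None, out _);
Console.WriteLine("en-GB accepts 01/31: {0}", gbAccepts); // False
```

Since no year is given, both results fall in the current year; only the month and day are compared here.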

There are many other checks as well: against the AM/PM pattern, the day of the week, time suffixes (for example, for Korean the suffix meaning hour is taken into account), and so on.

The following code would produce information about culture-specific tokens:

DateTimeFormatInfo dti = CultureInfo.InvariantCulture.DateTimeFormat;
dynamic hashes = dti.GetType().GetMethod("CreateTokenHashTable", BindingFlags.Instance | BindingFlags.NonPublic).Invoke(dti, null);
var tokens = Enumerable.Repeat(new { str = "", type = "", value = "" }, 0).ToList();
foreach (dynamic hash in hashes)
    if (hash != null)
    {
        Type hashType = hash.GetType();
        tokens.Add(new { str = (string)hashType.GetField("tokenString", BindingFlags.Instance | BindingFlags.NonPublic).GetValue(hash).ToString(),
                         type = (string)hashType.GetField("tokenType", BindingFlags.Instance | BindingFlags.NonPublic).GetValue(hash).ToString(),
                         value = (string)hashType.GetField("tokenValue", BindingFlags.Instance | BindingFlags.NonPublic).GetValue(hash).ToString() });
    }
foreach (var token in tokens.Distinct().OrderBy(t => t.type).ThenBy(t => t.value))
    Console.WriteLine("{0,10} {1} {2}", token.str, token.type, token.value);

For InvariantCulture the output is:

        AM 1027 0
        PM 1284 1
    Sunday DayOfWeekToken 0
       Sun DayOfWeekToken 0
    Monday DayOfWeekToken 1
       Mon DayOfWeekToken 1
   Tuesday DayOfWeekToken 2
       Tue DayOfWeekToken 2
 Wednesday DayOfWeekToken 3
       Wed DayOfWeekToken 3
       Thu DayOfWeekToken 4
  Thursday DayOfWeekToken 4
    Friday DayOfWeekToken 5
       Fri DayOfWeekToken 5
       Sat DayOfWeekToken 6
  Saturday DayOfWeekToken 6
        AD EraToken 1
      A.D. EraToken 1
         , IgnorableSymbol 0
         . IgnorableSymbol 0
   January MonthToken 1
       Jan MonthToken 1
   October MonthToken 10
       Oct MonthToken 10
  November MonthToken 11
       Nov MonthToken 11
  December MonthToken 12
       Dec MonthToken 12
  February MonthToken 2
       Feb MonthToken 2
     March MonthToken 3
       Mar MonthToken 3
       Apr MonthToken 4
     April MonthToken 4
       May MonthToken 5
      June MonthToken 6
       Jun MonthToken 6
       Jul MonthToken 7
      July MonthToken 7
       Aug MonthToken 8
    August MonthToken 8
 September MonthToken 9
       Sep MonthToken 9
         / SEP_Date 0
         - SEP_DateOrOffset 0
         日 SEP_DaySuff 0
         일 SEP_DaySuff 0
         时 SEP_HourSuff 0
         時 SEP_HourSuff 0
         T SEP_LocalTimeMark 0
         分 SEP_MinuteSuff 0
         月 SEP_MonthSuff 0
         월 SEP_MonthSuff 0
         秒 SEP_SecondSuff 0
         : SEP_Time 0
         년 SEP_YearSuff 0
         年 SEP_YearSuff 0
       GMT TimeZoneToken 0
         Z TimeZoneToken 0

For the fr-FR culture (notice that July is included in the list, as well as the other tokens from InvariantCulture), the output is:

        AM 1027 0
        PM 1284 1
         h DateWordToken 0
  dimanche DayOfWeekToken 0
       Sun DayOfWeekToken 0
      dim. DayOfWeekToken 0
    Sunday DayOfWeekToken 0
     lundi DayOfWeekToken 1
    Monday DayOfWeekToken 1
      lun. DayOfWeekToken 1
       Mon DayOfWeekToken 1
   Tuesday DayOfWeekToken 2
       Tue DayOfWeekToken 2
     mardi DayOfWeekToken 2
      mar. DayOfWeekToken 2
  mercredi DayOfWeekToken 3
 Wednesday DayOfWeekToken 3
      mer. DayOfWeekToken 3
       Wed DayOfWeekToken 3
     jeudi DayOfWeekToken 4
  Thursday DayOfWeekToken 4
       Thu DayOfWeekToken 4
      jeu. DayOfWeekToken 4
      ven. DayOfWeekToken 5
  vendredi DayOfWeekToken 5
    Friday DayOfWeekToken 5
       Fri DayOfWeekToken 5
    samedi DayOfWeekToken 6
      sam. DayOfWeekToken 6
       Sat DayOfWeekToken 6
  Saturday DayOfWeekToken 6
 ap. J.-C. EraToken 1
         , IgnorableSymbol 0
         . IgnorableSymbol 0
   January MonthToken 1
     janv. MonthToken 1
   janvier MonthToken 1
       Jan MonthToken 1
      oct. MonthToken 10
       Oct MonthToken 10
   octobre MonthToken 10
   October MonthToken 10
      nov. MonthToken 11
       Nov MonthToken 11
  novembre MonthToken 11
  November MonthToken 11
      déc. MonthToken 12
  December MonthToken 12
       Dec MonthToken 12
  décembre MonthToken 12
     févr. MonthToken 2
   février MonthToken 2
  February MonthToken 2
       Feb MonthToken 2
      mars MonthToken 3
     March MonthToken 3
       Mar MonthToken 3
       Apr MonthToken 4
      avr. MonthToken 4
     avril MonthToken 4
     April MonthToken 4
       mai MonthToken 5
       May MonthToken 5
      June MonthToken 6
      juin MonthToken 6
       Jun MonthToken 6
      July MonthToken 7
     juil. MonthToken 7
   juillet MonthToken 7
       Jul MonthToken 7
       Aug MonthToken 8
      août MonthToken 8
    August MonthToken 8
     sept. MonthToken 9
       Sep MonthToken 9
 septembre MonthToken 9
 September MonthToken 9
         / SEP_Date 0
         - SEP_DateOrOffset 0
         日 SEP_DaySuff 0
         일 SEP_DaySuff 0
         时 SEP_HourSuff 0
         時 SEP_HourSuff 0
         T SEP_LocalTimeMark 0
         分 SEP_MinuteSuff 0
         月 SEP_MonthSuff 0
         월 SEP_MonthSuff 0
         秒 SEP_SecondSuff 0
         : SEP_Time 0
         년 SEP_YearSuff 0
         年 SEP_YearSuff 0
       GMT TimeZoneToken 0
         Z TimeZoneToken 0



Answer 2:


The DateTime.GetDateTimeFormats() method does not list the date in the format "2009.7.28 05:23:15", which could be because of the default CultureInfo.

However, if you check the GetDateTimeFormats(IFormatProvider) overload, you can see that for the "fr-FR" culture the list does include formats with "dot" delimiters, e.g. 28.07.09 5:23:15

Hence one plausible assumption about how this works is that, if no specific culture is provided, DateTime.Parse() runs the string against the formats of all possible cultures and only throws an exception when none of them matches.

EDIT:

Digging through MSDN, Convert.ToDateTime(stringTime) parses the string with the DateTimeFormatInfo for the current culture.

If value is not null, the return value is the result of invoking the DateTime.Parse method on value using the formatting information in a DateTimeFormatInfo object that is initialized for the current culture. The value argument must contain the representation of a date and time in one of the formats described in the DateTimeFormatInfo topic.

So when no specific culture is set, the DateTimeFormatInfo object appears to come from the default constructor. Referring to MSDN,

This constructor creates a DateTimeFormatInfo object that represents the date and time information of the invariant culture. To create a DateTimeFormatInfo object for a specific culture, create a CultureInfo object for that culture and retrieve the DateTimeFormatInfo object returned by its CultureInfo.DateTimeFormat property.

So the invariant culture would be the default when no culture is defined: the single-argument string overload of Convert.ToDateTime refers to the default DateTimeFormatInfo object, which refers to the invariant culture. On that reading, Convert.ToDateTime has to run its validations across all cultures.

That is consistent with our assumption that validations are checked for all culture variants.

Hope it helps. Kudos, a really interesting observation.




Answer 3:


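Possible ways:

  // ConvertToDateTime is assumed here to be a small helper that wraps
  // Convert.ToDateTime(string) and reports a FormatException if parsing fails.
  string dateString = "05/01/1996";
  ConvertToDateTime(dateString);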
  dateString = "Tue Apr 28, 2009";
  ConvertToDateTime(dateString);
  dateString = "Wed Apr 28, 2009";
  ConvertToDateTime(dateString);
  dateString = "06 July 2008 7:32:47 AM";
  ConvertToDateTime(dateString);
  dateString = "17:32:47.003";
  ConvertToDateTime(dateString);
  // Convert a string returned by DateTime.ToString("R").
  dateString = "Sat, 10 May 2008 14:32:17 GMT";
  ConvertToDateTime(dateString);
  // Convert a string returned by DateTime.ToString("o").
  dateString = "2009-05-01T07:54:59.9843750-04:00";
  ConvertToDateTime(dateString);


int year = 2009;
int month = 7;
int day = 28;
int hr = 5;
int min = 23;
int s = 15;

// Convert.ToDateTime has no overload that takes date parts; the DateTime constructor does.
DateTime testDateObj = new DateTime(year, month, day, hr, min, s);

or simply

DateTime testDateObj = new DateTime(2009, 7, 28, 5, 23, 15);



Answer 4:


Try using the DateTimeFormatInfo class to get information about the date and time formats in your culture.
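As a rough illustration of that suggestion, the sketch below prints a few DateTimeFormatInfo properties and the standard patterns for fr-FR. The choice of properties is an assumption about what is useful to inspect, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Globalization;

DateTimeFormatInfo frInfo = CultureInfo.GetCultureInfo("fr-FR").DateTimeFormat;

// A few of the culture's individual settings.
Console.WriteLine("ShortDatePattern: {0}", frInfo.ShortDatePattern);
Console.WriteLine("LongTimePattern:  {0}", frInfo.LongTimePattern);
Console.WriteLine("MonthDayPattern:  {0}", frInfo.MonthDayPattern);
Console.WriteLine("DateSeparator:    {0}", frInfo.DateSeparator);

// All standard patterns the culture uses when formatting dates and times.
string[] patterns = frInfo.GetAllDateTimePatterns();
Console.WriteLine("{0} standard patterns:", patterns.Length);
foreach (string pattern in patterns)
{
    Console.WriteLine("  " + pattern);
}
```

These patterns describe how the culture formats values; as the first answer explains, DateTime.Parse accepts more inputs than this list suggests.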




Answer 5:


UPDATED: Try this:

string testDateStr = "2009.7.28 05:23:15";
string testDateObj = Convert.ToDateTime(testDateStr).Date.ToString("d");

string[] validFormats = Convert.ToDateTime(testDateObj).GetDateTimeFormats();
foreach (string s in validFormats)
{
    lblresult.Text += s;
}


Source: https://stackoverflow.com/questions/22809884/how-convert-todatetime-parses-a-given-string-when-the-given-culture-does-not-k
