Uri.EscapeDataString weirdness

Posted by 霸气de小男生 on 2019-11-30 04:47:06

Question


Why does EscapeDataString behave differently between .NET 4 and 4.5? The outputs are

  • .NET 4: Uri.EscapeDataString("-_.!~*'()") => "-_.!~*'()"

  • .NET 4.5: Uri.EscapeDataString("-_.!~*'()") => "-_.%21~%2A%27%28%29"
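
To check which of the two behaviours a particular build exhibits, a small probe along these lines may help. This is only a sketch: the class and member names (EscapeProbe, Probe, UsesRfc3986) are made up for illustration, and the only framework call used is Uri.EscapeDataString itself.

```csharp
using System;

static class EscapeProbe
{
    // The nine RFC 2396 "mark" characters used in the question.
    public const string Marks = "-_.!~*'()";

    // Returns whatever the running framework produces for the mark characters.
    public static string Probe()
    {
        return Uri.EscapeDataString(Marks);
    }

    // True when the result matches the second output quoted above,
    // i.e. only "-", "_", "." and "~" were left unescaped.
    public static bool UsesRfc3986()
    {
        return Probe() == "-_.%21~%2A%27%28%29";
    }
}
```

Calling EscapeProbe.Probe() from a project that targets .NET 4 and again from one that targets 4.5 should reproduce the two outputs listed above.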

The documentation

By default, the EscapeDataString method converts all characters except for RFC 2396 unreserved characters to their hexadecimal representation. If International Resource Identifiers (IRIs) or Internationalized Domain Name (IDN) parsing is enabled, the EscapeDataString method converts all characters, except for RFC 3986 unreserved characters, to their hexadecimal representation. All Unicode characters are converted to UTF-8 format before being escaped.

For reference, unreserved characters are defined as follows in RFC 2396:

unreserved    = alphanum | mark

mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
                "(" | ")"

And in RFC 3986:

ALPHA / DIGIT / "-" / "." / "_" / "~"
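
The practical difference between the two definitions is the set of punctuation characters that stop being unreserved. A minimal sketch that computes it, assuming the two mark sets exactly as quoted above (the class and member names are hypothetical):

```csharp
using System;
using System.Linq;

static class UnreservedMarks
{
    // Non-alphanumeric unreserved characters in each RFC, as quoted above.
    public const string Rfc2396 = "-_.!~*'()";
    public const string Rfc3986 = "-._~";

    // Characters that RFC 2396 treats as unreserved but RFC 3986 does not.
    public static string OnlyIn2396()
    {
        return new string(Rfc2396.Where(c => Rfc3986.IndexOf(c) < 0).ToArray());
    }
}
```

OnlyIn2396() yields the five characters ! * ' ( ), which are exactly the ones that show up percent-encoded in the second output.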

The source code

It looks like whether each character passed to EscapeDataString is escaped is determined roughly like this:

is unicode above \x7F
  ? PERCENT ENCODE
  : is a percent symbol
    ? is an escape char
      ? LEAVE ALONE
      : PERCENT ENCODE
    : is a forced character
      ? PERCENT ENCODE
      : is an unreserved character
        ? LEAVE ALONE
        : PERCENT ENCODE
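
The decision above can be written out as a small predicate. This is a rough sketch of the question's reading of the source, not the framework's actual code: it assumes that unreserved characters are left alone and everything else is encoded, and the names ShouldEscape, startsValidEscape, forced and legacyV2 are all hypothetical.

```csharp
using System;

static class EscapeDecisionSketch
{
    const string Rfc2396Marks = "-_.!~*'()";
    const string Rfc3986Marks = "-._~";

    // Same shape as IsUnreserved, but with the quirks flag passed in explicitly.
    public static bool IsUnreserved(char c, bool legacyV2)
    {
        bool asciiLetterOrDigit =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
        if (asciiLetterOrDigit) return true;
        return (legacyV2 ? Rfc2396Marks : Rfc3986Marks).IndexOf(c) >= 0;
    }

    // Returns true when the character should be percent-encoded.
    // startsValidEscape: the '%' already begins an escape sequence (assumption).
    // forced: characters the caller always wants encoded (assumption).
    public static bool ShouldEscape(char c, bool startsValidEscape, string forced, bool legacyV2)
    {
        if (c > '\u007F') return true;               // non-ASCII: always encode
        if (c == '%') return !startsValidEscape;     // existing escape: leave alone
        if (forced.IndexOf(c) >= 0) return true;     // forced character: encode
        return !IsUnreserved(c, legacyV2);           // unreserved: leave alone
    }
}
```

With legacyV2 set to true, ShouldEscape('!', false, "", true) returns false; with legacyV2 set to false it returns true, which matches the difference between the two outputs in the question.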

It's at that final check, "is an unreserved character", where the choice between RFC 2396 and RFC 3986 is made. The verbatim source code of the method is:

    internal static unsafe bool IsUnreserved(char c)
    {
        if (Uri.IsAsciiLetterOrDigit(c))
        {
            return true;
        }
        if (UriParser.ShouldUseLegacyV2Quirks)
        {
            return (RFC2396UnreservedMarks.IndexOf(c) >= 0);
        }
        return (RFC3986UnreservedMarks.IndexOf(c) >= 0);
    }

And that code refers to:

    private static readonly UriQuirksVersion s_QuirksVersion = 
        (BinaryCompatibility.TargetsAtLeast_Desktop_V4_5
             // || BinaryCompatibility.TargetsAtLeast_Silverlight_V6
             // || BinaryCompatibility.TargetsAtLeast_Phone_V8_0
             ) ? UriQuirksVersion.V3 : UriQuirksVersion.V2;

    internal static bool ShouldUseLegacyV2Quirks {
        get {
            return s_QuirksVersion <= UriQuirksVersion.V2;
        }
    }
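
Read together, the two snippets suggest that the targeted framework version alone selects the unreserved set. The sketch below models that chain under a simplifying assumption: BinaryCompatibility.TargetsAtLeast_Desktop_V4_5 is replaced by a plain comparison against a System.Version, and every type and member name here is hypothetical.

```csharp
using System;

// Stand-in for the framework's quirks enum; only the two values used above.
enum QuirksVersionSketch { V2 = 2, V3 = 3 }

static class QuirksSketch
{
    // Assumption: "targets at least 4.5" is modelled as a version comparison.
    public static QuirksVersionSketch FromTarget(Version target)
    {
        return target >= new Version(4, 5) ? QuirksVersionSketch.V3 : QuirksVersionSketch.V2;
    }

    // Same comparison as the ShouldUseLegacyV2Quirks property shown above.
    public static bool ShouldUseLegacyV2Quirks(Version target)
    {
        return FromTarget(target) <= QuirksVersionSketch.V2;
    }

    // The non-alphanumeric unreserved characters that IsUnreserved would accept.
    public static string UnreservedMarks(Version target)
    {
        return ShouldUseLegacyV2Quirks(target) ? "-_.!~*'()" : "-._~";
    }
}
```

Under this model, a 4.0 target keeps the RFC 2396 marks and a 4.5 target gets the RFC 3986 ones, with no reference to the IRI/IDN setting.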

Confusion

It seems contradictory that the documentation says the output of EscapeDataString depends on whether IRI/IDN parsing is enabled, whereas the source code says the output is determined by the value of TargetsAtLeast_Desktop_V4_5. Could someone clear this up?


Answer 1:


A lot of changes were made in 4.5 compared to 4.0 in how system functions behave. You can have a look at this thread:

Why does Uri.EscapeDataString return a different result on my CI server compared to my development machine?

or

You can go directly to the following link:

http://msdn.microsoft.com/en-us/library/hh367887(v=vs.110).aspx

All of these changes were made with input from users around the world.



Source: https://stackoverflow.com/questions/24962514/uri-escapedatastring-weirdness
