URL Decode Difference between C# and Java

Question

I have a URL-encoded string, %B9q.

When I decode it with C#:

string res = HttpUtility.UrlDecode("%B9q", Encoding.GetEncoding("Big5"));

It outputs 電, which is the result I want.

But when I use Java's decode function:

String res = URLDecoder.decode("%B9q", "Big5");

I get the output ?q instead.

Does anyone know why this happens and how I can solve it?

Thanks for any suggestions and help!

Answer 1:

As far as I can tell from the relevant spec, it looks like Java's way of handling things is correct.

The example given in the discussion of URI-to-IRI conversion seems especially relevant:

Conversions from URIs to IRIs MUST NOT use any character encoding other than UTF-8 in steps 3 and 4, even if it might be possible to guess from the context that another character encoding than UTF-8 was used in the URI. For example, the URI "http://www.example.org/r%E9sum%E9.html" might with some guessing be interpreted to contain two e-acute characters encoded as iso-8859-1. It must not be converted to an IRI containing these e-acute characters. Otherwise, in the future the IRI will be mapped to "http://www.example.org/r%C3%A9sum%C3%A9.html", which is a different URI from "http://www.example.org/r%E9sum%E9.html".

Answer 2:

Maybe Java's URLDecoder ignores some rules of the Big5 encoding standard. C# does the same thing as browsers like Chrome, but Java's URLDecoder doesn't. See this related answer: https://stackoverflow.com/a/27635806/1321255
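A possible workaround, as a minimal sketch rather than a definitive fix. In Big5, 電 is the two-byte sequence 0xB9 0x71, and 0x71 is also the ASCII code for q. Java's URLDecoder appears to decode each run of consecutive %xy escapes on its own, so %B9 is decoded as an incomplete Big5 sequence (giving ?) and q is left as a literal character. The sketch below first decodes with ISO-8859-1, which maps every byte to exactly one character, and then reinterprets the resulting bytes as Big5. The class name Big5UrlDecode and the helper decode are hypothetical, and the approach assumes the input contains only percent escapes and ASCII/Latin-1 characters and that the JVM provides the Big5 charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of any library.
class Big5UrlDecode {

    // Decodes a percent-encoded string so that literal ASCII characters
    // (such as 'q' in "%B9q") can act as trail bytes of a multi-byte character.
    static String decode(String encoded, String charsetName) {
        String latin1;
        try {
            // Step 1: ISO-8859-1 maps each byte 0x00-0xFF to one char,
            // so no byte information is lost here.
            // Note: URLDecoder also turns '+' into a space.
            latin1 = URLDecoder.decode(encoded, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is required on every Java platform.
            throw new IllegalStateException(e);
        }
        // Step 2: recover the raw bytes, literal characters included.
        byte[] raw = latin1.getBytes(StandardCharsets.ISO_8859_1);
        // Step 3: decode the whole byte sequence in the target charset.
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Should print the same character the C# call produces (U+96FB).
        System.out.println(decode("%B9q", "Big5"));
    }
}
```

With this helper, decode("%B9q", "Big5") should return the single character 電. Characters outside ISO-8859-1 that appear literally in the input would be lost in step 2, so this only suits input that is fully percent-encoded apart from ASCII.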

Source: https://stackoverflow.com/questions/24254527/url-decode-difference-between-c-sharp-and-java

Tags

java

decode