How does the browser behave when encoding a URL?

和自甴很熟 submitted on 2020-01-03 03:30:33

Question


I'm testing how Firefox encodes non-ASCII characters in URLs, but the results confuse me.

HTML code:

<html lang="zh-CN">
<head>
<title>some Chinese character</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<img src="http://localhost/xxx" />
</body>
</html>

Here xxx stands for some Chinese characters. These characters must be percent-encoded into the form %xx before they can be sent over HTTP.
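As a rough illustration of what the %xx form looks like, the Python sketch below percent-encodes the same text under UTF-8 and under GBK. The string 中文 is only a hypothetical stand-in for the elided xxx, and `urllib.parse.quote` is used purely for demonstration; it is not what the browser runs internally.

```python
from urllib.parse import quote

# Hypothetical stand-in for the "xxx" part of the img URL.
path = "中文"

# The same two characters produce different bytes, and therefore
# different %xx sequences, depending on the character encoding.
utf8_form = quote(path, encoding="utf-8")
gbk_form = quote(path, encoding="gbk")

print(utf8_form)  # %E4%B8%AD%E6%96%87
print(gbk_form)   # %D6%D0%CE%C4
```

Comparing the request line the server receives against these two forms is one way to tell which encoding the browser actually used.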

First, I saved the source file as UTF-8 and opened the HTML file in Firefox. The img tag sends a request, and the "xxx" characters are encoded as UTF-8.

  • (HTML source file saved as UTF-8, charset=utf-8: the browser encodes the URL as UTF-8)

I then changed the meta tag to <meta http-equiv="Content-Type" content="text/html; charset=gbk">, but nothing changed.

  • (HTML source file saved as UTF-8, charset=gbk: the browser encodes the URL as UTF-8)

Second, I saved the source file as ANSI, which is probably GBK or GB2312.

With charset=gbk, the characters are still encoded as UTF-8.

  • (HTML source file saved as GBK, charset=gbk: the browser encodes the URL as UTF-8)

BUT with charset=utf-8, the characters are encoded as GBK. Incidentally, the other Chinese characters on the page are not displayed correctly, e.g. the string in the title.

  • (HTML source file saved as GBK, charset=utf-8: the browser encodes the URL as GBK)
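The garbled title in the last case can be reproduced outside the browser. The sketch below, which again uses 中文 as a hypothetical sample string, takes GBK bytes and decodes them as UTF-8, which is what a mismatched charset declaration asks for. It only illustrates the display problem; it does not claim to model how Firefox chooses the bytes it puts in the URL.

```python
# Hypothetical sample text standing in for the page's Chinese content.
text = "中文"

# Bytes as they would sit on disk in a file saved as GBK.
gbk_bytes = text.encode("gbk")

# Decoding those bytes with the wrong (declared) charset: the sequences
# are not valid UTF-8, so they are replaced with U+FFFD.
as_utf8 = gbk_bytes.decode("utf-8", errors="replace")

# Decoding with the charset the file was really saved in recovers the text.
round_trip = gbk_bytes.decode("gbk")

print(repr(as_utf8))
print(round_trip)
```

The exact replacement characters a browser shows may differ from Python's output, but the cause is the same: the declared charset does not match the bytes in the file.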

How can I control the browser's encoding behavior?


Answer 1:


UTF-8 is the standard for URL encoding. If you encode your source file physically in GBK, but use utf-8 in the content-type, you are just lying to the browser and will get inconsistent or non-working results.

From RFC 3986, section 2.5:

When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded. For example, the character A would be represented as "A", the character LATIN CAPITAL LETTER A WITH GRAVE would be represented as "%C3%80", and the character KATAKANA LETTER A would be represented as "%E3%82%A2".
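The three examples in the quoted passage can be checked with a few lines of Python. This is a minimal sketch, assuming `urllib.parse.quote` with its default UTF-8 encoding; the `src` value at the end is a hypothetical way to build the img URL so that the page already contains the percent-encoded form, instead of leaving the choice of encoding to the browser.

```python
from urllib.parse import quote

# The three characters named in the quoted passage.
examples = ["A", "\u00C0", "\u30A2"]  # A, À, ア

# quote() encodes to UTF-8 by default and leaves unreserved characters alone.
encoded = [quote(ch, safe="") for ch in examples]
print(encoded)  # ['A', '%C3%80', '%E3%82%A2']

# Hypothetical: pre-encode the path before writing it into the HTML.
src = "http://localhost/" + quote("中文")
print(src)  # http://localhost/%E4%B8%AD%E6%96%87
```

A URL that is already pure ASCII when it reaches the browser contains nothing left for the browser to encode, so it does not depend on how the file was saved or which charset the page declares.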



Source: https://stackoverflow.com/questions/14001224/whats-the-behavior-the-browser-encoding-url
