How can I know if a url-encoded string is UTF-8 or Latin-1 with PHP?

Submitted by 非 Y 不嫁゛ on 2019-12-08 01:57:49

Question


I am getting data from various sites through URLs. The URL parameters are url-encoded with the PHP urlencode() function, but the underlying character encoding can still be either UTF-8 or Latin-1.

For example, the é character, when url-encoded from UTF-8 becomes %C3%A9 but when url-encoded from Latin-1, it becomes %E9.
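A minimal sketch of that difference, using byte escapes so the result does not depend on how the source file itself is saved (the variable names are only illustrative):

```php
<?php

// "é" is the two bytes C3 A9 in UTF-8 and the single byte E9 in Latin-1.
$utf8   = "\xC3\xA9";
$latin1 = "\xE9";

echo urlencode($utf8), "\n";   // %C3%A9
echo urlencode($latin1), "\n"; // %E9

// urldecode() only reverses the percent-escapes: it hands back the raw
// bytes and keeps no record of which charset produced them.
var_dump(urldecode('%C3%A9') === $utf8);  // bool(true)
var_dump(urldecode('%E9') === $latin1);   // bool(true)
```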

When I get data through a URL, I call urldecode() and then need to know the character encoding, so that I can apply utf8_encode() where necessary before inserting the values into a MySQL database.

Strangely, the following code doesn't work:

$x1 = 'Cl%C3%A9ment';
$x2 = 'Cl%E9ment';

echo mb_detect_encoding(urldecode($x1)).' / '.mb_detect_encoding(urldecode($x2));

It returns UTF-8 / UTF-8

Why is that, what am I doing wrong, and how can I find out the character encoding of those strings?

Thanks


Answer 1:


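mb_detect_encoding() is normally useless with the default second parameter: the default detection order (see mb_detect_order()) typically contains only ASCII and UTF-8, so ISO-8859-1 is never a candidate. Pass an explicit list of encodings instead: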

<?php

$x1 = 'Cl%C3%A9ment';
$x2 = 'Cl%E9ment';

$encoding_list = array('utf-8', 'iso-8859-1');

var_dump(
    mb_detect_encoding(urldecode($x1), $encoding_list),
    mb_detect_encoding(urldecode($x2), $encoding_list)
);

... prints:

string(5) "UTF-8"
string(10) "ISO-8859-1"
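Since every byte sequence is valid ISO-8859-1, the only real test is whether the decoded bytes are well-formed UTF-8. A possible sketch along those lines, assuming the input can only be UTF-8 or Latin-1; decode_to_utf8() is a hypothetical helper name, not something from the original answer, and it uses mb_convert_encoding() in place of utf8_encode(), which is deprecated as of PHP 8.2:

```php
<?php

/**
 * Hypothetical helper: decode a URL-encoded value and return it as UTF-8.
 * Assumption: the sender used either UTF-8 or ISO-8859-1 (Latin-1).
 */
function decode_to_utf8(string $urlEncoded): string
{
    $raw = urldecode($urlEncoded);

    // Well-formed UTF-8 is kept as-is.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    // Assumption: anything that is not valid UTF-8 is treated as Latin-1.
    return mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
}

// Both spellings end up as the same UTF-8 bytes.
var_dump(decode_to_utf8('Cl%C3%A9ment') === decode_to_utf8('Cl%E9ment')); // bool(true)
echo bin2hex(decode_to_utf8('Cl%E9ment')), "\n"; // 436cc3a96d656e74
```

One caveat: a Latin-1 string whose bytes happen to form valid UTF-8 (for example the two characters "Ã©") would be taken for UTF-8, so this is a heuristic rather than a guarantee.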


Source: https://stackoverflow.com/questions/21384050/how-can-i-know-if-url-encoded-string-is-utf-8-or-latin-1-with-php
