$_POST will convert from utf-8 to ä ö ü etc

Submitted by ⅰ亾dé卋堺 on 2019-12-04 00:11:19
gioele

You are facing several different problems at the same time, so let's start with the simplest one.

Problem 1) You say that echo $_POST['field']; will display it correctly? What do you mean by "display"? It can be displayed correctly in two cases:

  • either the field is in UTF-8 and your page has been declared as UTF-8 and the browser is displaying it as UTF-8 or,
  • the field is in Latin-1 and the browser has decided (through the auto-detection heuristics) that your page is in Latin-1.

So, the fact that echo $_POST['field']; is correct tells you nothing.
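
One way to find out what was actually submitted is to look at the raw bytes instead of the rendered page. This is a minimal sketch, assuming the mbstring extension is available; describe_bytes() is a hypothetical helper name, not a PHP built-in. In a real script you would pass it $_POST['field'].

```php
<?php
// Hypothetical helper: show the raw bytes of a value and whether they form valid UTF-8.
function describe_bytes(string $value): string
{
    $hex = bin2hex($value);
    $isUtf8 = mb_check_encoding($value, 'UTF-8'); // requires the mbstring extension
    return $hex . ' (' . ($isUtf8 ? 'valid UTF-8' : 'not valid UTF-8') . ')';
}

echo describe_bytes("\xC3\xBC"), "\n"; // "ü" encoded as UTF-8 (two bytes)
echo describe_bytes("\xFC"), "\n";     // "ü" encoded as ISO Latin-1 (one byte)
```

Both values can look identical in a browser, but the hex dump shows which encoding really arrived.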

Problem 2) You are using

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
header('Content-Type:text/html; charset=UTF-8');

Is the second line PHP code? If it is, it will fail here, because header() must be called before any byte of output has been sent, and the <meta> line above it is already output. In that case the Content-Type header will not be set and PHP should generate a warning.
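
A sketch of the safer ordering, with the header sent first and the markup afterwards. send_utf8_header() is a hypothetical helper introduced only for illustration; headers_sent() and header() are the standard PHP functions.

```php
<?php
// Hypothetical helper: set the charset header only while that is still possible.
function send_utf8_header(): bool
{
    if (headers_sent($file, $line)) {
        // Some output (even whitespace or a BOM) has already gone out at $file:$line.
        error_log("Cannot set Content-Type, output started at $file:$line");
        return false;
    }
    header('Content-Type: text/html; charset=UTF-8');
    return true;
}

send_utf8_header(); // call this first, before any HTML is emitted
echo '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>', "\n";
```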

Problem 3) You are using

<form action="whatever.php" accept-charset="UTF-8">

Some browsers (IE, mostly) ignore accept-charset if they can coerce the data to be sent in ASCII or ISO Latin-1. So the data will either be in UTF-8 but declared as ISO Latin-1, or be in ISO Latin-1 and sent as ISO Latin-1 (but this second case is not your case).

Have a look at https://stackoverflow.com/a/8547004/449288 to see how to solve this problem.
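
Independently of the approach in the linked answer, the receiving script can defend itself by checking what it got. The following is only a sketch under stated assumptions: normalize_to_utf8() is a hypothetical helper, it needs the mbstring extension, and it assumes that anything that is not valid UTF-8 was sent as ISO Latin-1.

```php
<?php
// Hypothetical helper: accept valid UTF-8 as-is, otherwise treat the bytes as ISO Latin-1.
function normalize_to_utf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already UTF-8, leave untouched
    }
    // Assumption: the only other encoding a browser would have used is ISO Latin-1.
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

echo bin2hex(normalize_to_utf8("M\xFCnchen")), "\n";     // Latin-1 input
echo bin2hex(normalize_to_utf8("M\xC3\xBCnchen")), "\n"; // UTF-8 input
```

This is a heuristic: a Latin-1 string whose bytes happen to form valid UTF-8 would be passed through unchanged.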

Problem 4) Which strings are you comparing? For example, if you have

$city = "München";
$_POST['city'] == $city;

The result of this code will depend on the encoding of the PHP file. If the file is encoded in ISO Latin-1 and the $_POST correctly contains UTF-8 data, the == will compare different bytes and will return false.
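
The mismatch can be reproduced without depending on the encoding of the file itself by writing the bytes out explicitly. In this sketch $fromSource and $posted are illustrative stand-ins for the hardcoded string and for $_POST['city'].

```php
<?php
// "München" as it would be stored in a source file saved as ISO Latin-1.
$fromSource = "M\xFCnchen";
// "München" as a browser submits it from a UTF-8 form.
$posted = "M\xC3\xBCnchen";

var_dump($fromSource == $posted); // bool(false)
echo bin2hex($fromSource), "\n";  // 4dfc6e6368656e
echo bin2hex($posted), "\n";      // 4dc3bc6e6368656e
```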

Another solution that may be helpful, if you are using Apache, is a directive called AddDefaultCharset that you can place in your configuration file (httpd.conf) or in .htaccess. It looks like this:

AddDefaultCharset utf-8

http://httpd.apache.org/docs/2.0/mod/core.html#adddefaultcharset

That will override any other default charsets.

I changed "mbstring.detect_order = pass" in my php.ini file and it worked.

I've used Unicode characters in my forms and files many times, and I have not had any problems so far. Try these steps and check the result:

  1. Remove header('Content-Type:text/html; charset=UTF-8'); from your HTML form code.
  2. Use your form just like <form action="whatever.php"> without accept-charset="UTF-8". (It is better to specify the method of sending data in your form tag.)
  3. In the target page (whatever.php), insert <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> again in a <head> tag.

I have always set up my projects the way I described here, and I have not had any problems with Unicode strings.

This is due to the character encoding of the PHP file(s).

The hardcoded München is stored with the character encoding of the source file(s), in this case ANSI. When that value is compared to the UTF-8 encoded value provided in the $_POST variable, the two will, quite naturally, differ.

The solution to your problem is one of:

  1. Serve and process content with the same encoding as that of the source file(s), in this case likely to be windows-1252.
    • This would, for starters, include changing the content="text/html; charset=UTF-8" to content="text/html; charset=windows-1252" whenever serving HTML data.
  2. Avoid all hardcoded values that could be affected by character encoding differences between UTF-8 and windows-1252; in practice, only hardcode values that consist of English letters and numbers.
    • Any UTF-8 values would have to be read from a source that ensures they are UTF-8 encoded (for instance a database set to use UTF-8 as storage encoding as well as connection encoding).
  3. Wrap all hardcoded assignments in utf8_encode(), for instance $value = utf8_encode ('München');
  4. Change the encoding of the source file(s) to UTF-8.
    • This can be accomplished in any number of ways: a decent text editor will be able to do it, or the outstanding libiconv can be used, especially for batch processing.

Either solution 1 or 4 would be my preferred solution, especially if multiple people are involved in the project.
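
For solution 4, the conversion can also be scripted from PHP itself. This is a rough sketch, assuming the iconv and mbstring extensions are available and that the files really are windows-1252; convert_file_to_utf8() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: re-encode one file in place from windows-1252 to UTF-8.
function convert_file_to_utf8(string $path, string $from = 'WINDOWS-1252'): bool
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        return false;
    }
    // Skip files that are already valid UTF-8 so they are not encoded twice.
    if (mb_check_encoding($contents, 'UTF-8')) {
        return true;
    }
    $converted = iconv($from, 'UTF-8', $contents);
    if ($converted === false) {
        return false;
    }
    return file_put_contents($path, $converted) !== false;
}

// Demonstration on a temporary file holding "München" in windows-1252.
$tmp = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($tmp, "M\xFCnchen");
convert_file_to_utf8($tmp);
echo bin2hex(file_get_contents($tmp)), "\n"; // 4dc3bc6e6368656e
unlink($tmp);
```

For a whole project you would loop over the source files, ideally on a copy or under version control, since the conversion overwrites each file.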

As a side-note, some text editors (notably Notepad++) have the option of using either UTF-8 or UTF-8 without BOM. The BOM (Byte Order Mark) is pointless in UTF-8 and will cause problems when writing headers in PHP (most often when doing a redirect). This is because the BOM sits right in front of the initial <?php, so the server sends the BOM just as it would send any other character placed there. The difference is that you would notice an ordinary character in front, whereas the BOM isn't displayed.
Rule of thumb: Always use UTF-8 without BOM.
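
Because the BOM is invisible in most editors, it can help to check for it programmatically. A small sketch: the UTF-8 BOM is the three bytes EF BB BF, and has_utf8_bom() and strip_utf8_bom() are hypothetical helper names.

```php
<?php
// The UTF-8 byte order mark is the byte sequence EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: does this file content start with a UTF-8 BOM?
function has_utf8_bom(string $contents): bool
{
    return strncmp($contents, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: return the content without a leading BOM.
function strip_utf8_bom(string $contents): string
{
    return has_utf8_bom($contents) ? substr($contents, 3) : $contents;
}

$withBom = UTF8_BOM . "<?php echo 'hi';";
var_dump(has_utf8_bom($withBom));                 // bool(true)
var_dump(has_utf8_bom(strip_utf8_bom($withBom))); // bool(false)
```

In practice you would feed these helpers the result of file_get_contents() for each source file.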
