UTF-8 characters displayed as ISO-8859-1

Submitted by 自古美人都是妖i on 2020-01-02 06:35:10

Question


I've got an issue with inserting and reading UTF-8 content from a database. All the verifications I've done suggest that the content in my DB should be UTF-8 encoded, yet it seems to be Latin-1 (ISO-8859-1) encoded. The data is initially imported by a PHP script run from the CLI.

Configuration:

Zend Framework Version: 1.10.5
mysql-server-5.0:   5.0.51a-3ubuntu5.7
php5-mysql:     5.2.4-2ubuntu5.10
apache2:        2.2.8-1ubuntu0.16
libapache2-mod-php5:    5.2.4-2ubuntu5.10

Verifications:

-mysql:

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> SHOW VARIABLES LIKE 'collation%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_bin        |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+

-database

created with:
CREATE DATABASE mydb CHARACTER SET utf8 COLLATE utf8_bin;
CREATE SCHEMA `mydb` DEFAULT CHARACTER SET utf8 COLLATE utf8_bin ;

mysql> status;
--------------
mysql  Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2

Connection id:          7
Current database:       mydb
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.51a-3ubuntu5.7-log (Ubuntu)
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:            /var/run/mysqld/mysqld.sock
Uptime:                 9 min 45 sec

-sql: before doing my inserts I run

SET names 'utf8';

-php: before doing my inserts I use utf8_encode() and mb_detect_encoding(), which gives me 'UTF-8'. After retrieving the content from the db and before sending it to the user, mb_detect_encoding() also gives 'UTF-8'.

Validation test:

The only way for me to have the content displayed properly is to set the content type to Latin-1 (if I sniff the traffic, I can see the Content-Type header with ISO-8859-1):

ini_set('default_charset', 'ISO-8859-1');

This test shows that the content comes out as Latin-1, and I don't understand why. Does anybody have any idea?

Thanks.


Answer 1:


Well, I've found that SET NAMES isn't really all that great. Take a peek at the docs...

What I typically do is execute 4 queries:

SET CHARACTER SET 'UTF8';
SET character_set_database = 'UTF8';
SET character_set_connection = 'UTF8';
SET character_set_server = 'UTF8';

Give that a shot and see if that does it for you...

Oh, and remember, all UTF-8 characters <= 127 are valid ISO-8859-1 characters as well. So if you only have characters <= 127 in the stream, mb_detect_encoding will fall back on the charset with the higher precedence (which is by default "UTF-8")...
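
To illustrate the point about characters <= 127, here is a minimal sketch that compares raw bytes. It is written in Python purely as a neutral way to look at the bytes (the question itself uses PHP), and the sample strings are made up for the example. It shows that ASCII-only text yields identical bytes in UTF-8 and ISO-8859-1, so no detector can tell the two apart on such input, while a single accented character makes the byte sequences differ.

```python
# Hypothetical sample strings: one ASCII-only, one with a non-ASCII character.
ascii_text = "plain text"
accented_text = "caf\u00e9"  # "café"

# Encode each string both ways and keep the raw bytes.
ascii_utf8 = ascii_text.encode("utf-8")
ascii_latin1 = ascii_text.encode("iso-8859-1")
accented_utf8 = accented_text.encode("utf-8")
accented_latin1 = accented_text.encode("iso-8859-1")

# ASCII-only text: both encodings produce exactly the same bytes.
same_for_ascii = ascii_utf8 == ascii_latin1

# Text with a character above 127: the encodings diverge
# (two bytes for the accented letter in UTF-8, one byte in ISO-8859-1).
same_for_accented = accented_utf8 == accented_latin1

print("ASCII-only identical:", same_for_ascii)
print("Accented identical:", same_for_accented)
print("UTF-8 bytes:", accented_utf8)
print("ISO-8859-1 bytes:", accented_latin1)
```

In other words, a detection result of 'UTF-8' on mostly-ASCII content says little about how the non-ASCII characters are actually encoded.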




Answer 2:


  1. What are you doing before retrieval? Do you also run 'SET NAMES utf8;' there? Otherwise, MySQL will silently convert the results to whatever charset the connection declares.
  2. If that is not it, what does SHOW FULL COLUMNS FROM table; show? A table having a default charset does not mean every column uses it. For example, this is valid:

CREATE TABLE test (
    `name` varchar(10) character set latin1
) CHARSET=utf8
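
The silent conversion mentioned in point 1 can be simulated without a database. The sketch below is in Python and is only an approximation of the idea, not of MySQL's internals: the helper name bytes_sent_to_client, the sample value, and the assumption that the reading connection declares latin1 are all hypothetical. It shows that text stored correctly would reach the client as ISO-8859-1 bytes over such a connection, which would display properly only when the page is declared as ISO-8859-1.

```python
# Hypothetical value, assumed to be stored correctly in a utf8 column.
stored_text = "caf\u00e9"  # "café"


def bytes_sent_to_client(text, connection_charset):
    """Mimic a server converting a result to the connection's charset.

    This is an illustrative stand-in, not a real MySQL API.
    """
    return text.encode(connection_charset)


# Assumption: the reading connection never ran SET NAMES utf8,
# so it still declares latin1 (ISO-8859-1).
wire_latin1 = bytes_sent_to_client(stored_text, "iso-8859-1")

# For comparison: the same read over a connection that declares utf8.
wire_utf8 = bytes_sent_to_client(stored_text, "utf-8")

# A page declared as UTF-8 cannot decode the latin1 byte for the accent;
# the invalid byte is replaced with U+FFFD.
shown_as_utf8 = wire_latin1.decode("utf-8", errors="replace")

# A page declared as ISO-8859-1 decodes the same bytes correctly.
shown_as_latin1 = wire_latin1.decode("iso-8859-1")

print("latin1 connection sends:", wire_latin1)
print("utf8 connection sends:", wire_utf8)
print("rendered as UTF-8:", shown_as_utf8)
print("rendered as ISO-8859-1:", shown_as_latin1)
```

If this is what is happening, the stored data may be fine and only the charset of the connection used for reading needs to be set.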


Source: https://stackoverflow.com/questions/3311243/utf-8-characters-displayed-as-iso-8859-1
