MySQL Bulgarian language, character set

Submitted by 旧街凉风 on 2020-01-03 00:23:32

Question


I have a MySQL table that stores multiple languages, one field per language.

My character set is UTF-8 (collation utf8_general_ci).

When I look at the table in phpMyAdmin, the title of a Bulgarian page looks like this:

Ð—Ð° Ð½Ð°Ñ

This is a title. This same title shows up in the website like this:

За нас  (this is correct)

What am I doing wrong?


Answer 1:


Try executing these queries before you fetch the records:

mysql_query("SET NAMES 'utf8'");
mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'");

Then proceed with your own queries. The statements above must, of course, be run on the same database connection that you use for those queries.




Answer 2:


It looks like the data is UTF-8 encoded. It therefore displays correctly on a web page declared as UTF-8, but not in a program that cannot handle UTF-8 or has not been configured to use it.

For example, the Cyrillic small letter а (U+0430) appears twice in the correct text. Its UTF-8 form is the two bytes 0xD0 0xB0, and when those bytes are read as ISO-8859-1 they display as Ð° (U+00D0 U+00B0). So apparently UTF-8 data is being misinterpreted as ISO-8859-1, Windows-1252, or some similar 8-bit encoding.
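
The misinterpretation can be reproduced outside the database. The following is a minimal sketch in Python (chosen only for illustration; it is not the language used in the question), assuming the stored title is the correct text "За нас" encoded as UTF-8 and that the viewing tool decodes the bytes as ISO-8859-1:

```python
# The title as it should appear on the page.
correct = "За нас"

# What is physically stored: the UTF-8 bytes of that text.
utf8_bytes = correct.encode("utf-8")
print(utf8_bytes.hex(" "))  # d0 97 d0 b0 20 d0 bd d0 b0 d1 81

# What a tool sees if it assumes one byte per character (ISO-8859-1).
misread = utf8_bytes.decode("latin-1")

# The letter "а" (U+0430) turns into the two characters U+00D0 U+00B0.
pair = "а".encode("utf-8").decode("latin-1")
print([f"U+{ord(ch):04X}" for ch in pair])  # ['U+00D0', 'U+00B0']
print(misread.count(pair))  # 2
```

Some of the resulting characters (0x97 and 0x81 here) are control codes in ISO-8859-1, so the garbled string may look shorter on screen than it actually is.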




Answer 3:


What character set do the fields in your table use? Can you share the relevant part of the SHOW CREATE TABLE output for these fields?

ISO-8859-1 (latin1) was long the default database charset for MySQL, and it mostly performs no conversions, so people effectively use it as BINARY and just store UTF-8 encoded Cyrillic in it. This works well with web development tools, because they bind to the field, receive the data as UTF-8 encoded bytes, and then, without conversion, put it in a web page that declares UTF-8 as its output encoding. The data simply passes through without ever being properly encoded for the database to use.

Of course, this causes all kinds of problems when you do operations inside the database (for example, getting the character length rather than the byte length, or sorting properly). But for basic store/retrieve operations it looks like it is working. This is very typical behavior for non-localized web apps that assume they are working with ASCII, or ISO-8859-1 at most.

The remedy is to create a new set of tables that use the UTF-8 encoding, explicitly transcode the wrongly stored UTF-8 data to wide chars, and then put the result into the UTF-8 tables so that the database is aware of the encoding actually in use.
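
The transcoding step can be sketched as follows. This is a minimal illustration in Python, not a migration script: `repair_mojibake` is a hypothetical helper name, and it assumes the value was read from a latin1 column, so that each stored byte arrived as one ISO-8859-1 character.

```python
def repair_mojibake(text: str) -> str:
    """Undo a 'UTF-8 bytes read as ISO-8859-1' mix-up.

    If the text cannot be the product of that mix-up, it is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as what they really are.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Simulated value as a latin1 column would hand it back (assumption).
stored = "За нас".encode("utf-8").decode("latin-1")
repaired = repair_mojibake(stored)

print(repaired)       # За нас
print(len(stored))    # 11, one "character" per stored byte
print(len(repaired))  # 6, the real character count
```

The two lengths show why character counts and sorting go wrong inside the database while plain store/retrieve appears fine. Text that is already correct, or that is genuine ISO-8859-1 such as "café", fails one of the two conversions and is left as it is.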



Source: https://stackoverflow.com/questions/9391140/mysql-bulgarian-languages-character-set
