MySQL Convert latin1 data to UTF8

-上瘾入骨i 2020-12-03 02:30

I imported some data using LOAD DATA INFILE into a MySQL Database. The table itself and the columns are using the UTF8 character set, but the default character set of the d

7 answers
  • 2020-12-03 02:40

    Though this is probably no longer relevant for the OP, I found a solution in the MySQL documentation for ALTER TABLE. I'm posting it here for future reference:

    Warning

    The CONVERT TO operation converts column values between the character sets. This is not what you want if you have a column in one character set (like latin1) but the stored values actually use some other, incompatible character set (like utf8). In this case, you have to do the following for each such column:

    ALTER TABLE t1 CHANGE c1 c1 BLOB;
    ALTER TABLE t1 CHANGE c1 c1 TEXT CHARACTER SET utf8;
    

    The reason this works is that there is no conversion when you convert to or from BLOB columns.
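
    As a rough illustration of the difference (Python standing in for what the server does, not MySQL itself), the sketch below contrasts transcoding with relabelling. The function names are hypothetical, and Python's "latin-1" codec is only an approximation of MySQL's latin1, which is closer to cp1252.

```python
# Illustration only: plain Python byte handling, no MySQL involved.
original = "café"

# Bytes as they sit in a column that is labelled latin1 but really holds UTF-8.
stored = original.encode("utf-8")


def convert_to_utf8(raw: bytes) -> bytes:
    """Mimic CONVERT TO: trust the latin1 label, then transcode to utf8."""
    return raw.decode("latin-1").encode("utf-8")


def relabel_via_blob(raw: bytes) -> bytes:
    """Mimic the BLOB round trip: the bytes are untouched, only the label changes."""
    return raw


# Transcoding double-encodes the accented character; relabelling keeps it intact.
print(convert_to_utf8(stored).decode("utf-8"))
print(relabel_via_blob(stored).decode("utf-8"))
```

    The first result is the familiar mojibake form of the word, while the second is the original text, which is why the BLOB step matters when the column label and the stored bytes disagree.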

  • 2020-12-03 02:42

    Try this:

    1) Dump your DB

    mysqldump --default-character-set=latin1 -u username -p databasename > dump.sql
    

    2) Open dump.sql in a text editor and replace all occurrences of "SET NAMES latin1" with "SET NAMES utf8"

    3) Create a new database and restore your dumpfile

    cat dump.sql | mysql -u root -p newdbname
    
  • 2020-12-03 02:44

    I wrote http://code.google.com/p/mysqlutf8convertor/ to convert a Latin database to a UTF-8 database. It changes all tables and fields to UTF-8.

  • 2020-12-03 02:49

    Converting latin1 to UTF8 is not what you want to do; you need roughly the opposite.

    If what really happened was this:

    1. UTF-8 strings were interpreted as Latin-1 and transcoded to UTF-8, mangling them.
    2. You are now, or could be, reading UTF-8 strings with no further interpretation.

    What you must do now is:

    1. Read the "UTF-8" with no transcode.
    2. Convert it to Latin-1. Now you should actually have the original UTF-8.
    3. Now put it in your "UTF-8" column with no further conversion.
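
    The three steps above can be sketched in Python, assuming the mangled text has already been read as a string with no further transcoding. The helper name `repair_double_encoded` is hypothetical, and "latin-1" is an approximation: MySQL's latin1 behaves like cp1252, so data containing characters such as "€" may need that codec instead.

```python
def repair_double_encoded(text: str) -> str:
    """Undo one round of 'UTF-8 read as Latin-1, then re-encoded as UTF-8'.

    Returns the input unchanged when it does not look double-encoded.
    """
    try:
        # Step 2: encoding to Latin-1 recovers the original UTF-8 bytes.
        original_bytes = text.encode("latin-1")
        # Step 3: read those bytes as the UTF-8 they always were.
        return original_bytes.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or not valid UTF-8 afterwards:
        # treat the value as already correct.
        return text


print(repair_double_encoded("Ã¼ber"))  # mangled input
print(repair_double_encoded("über"))   # already correct, left alone
```

    The fallback is a heuristic: a string that legitimately contains a sequence like "Ã¼" would also be rewritten, so it is worth checking a sample of rows before applying this to a whole table.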
  • 2020-12-03 02:52

    I recently completed a shell script that automates the conversion process. It can also be configured with custom filters for any text you wish to replace or remove, for example stripping HTML characters. Table whitelists and blacklists are also possible. You can download it from SourceForge: https://sourceforge.net/projects/mysqltr/

  • 2020-12-03 02:54

    I've had cases like this in old WordPress installations, where the problem was that the data itself was already UTF-8 inside a latin1 database (due to the WP default charset). This means the data did not need converting, only the database and table definitions. In my experience things get messed up during the dump because, as I understand it, MySQL uses the client's default character set, which in many cases is now UTF-8. It is therefore very important to export using the same character set the database is labelled with. For a latin1 database that holds UTF-8 data:

    $ mysqldump --default-character-set=latin1 --databases wordpress > m.sql
    

    Then replace the latin1 references in the exported dump before re-importing it into a new UTF-8 database, along these lines:

    $ replace "CHARSET=latin1" "CHARSET=utf8" \
        "SET NAMES latin1" "SET NAMES utf8" < m.sql > m2.sql
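
    The `replace` utility shipped with older MySQL releases and may not be installed everywhere. A minimal Python sketch of the same substitution follows; the function names are hypothetical and the file names `m.sql` and `m2.sql` are taken from the commands above. It works on raw bytes so the dump's data is never transcoded along the way.

```python
from pathlib import Path

# Byte-level substitutions, mirroring the two pairs passed to `replace` above.
REPLACEMENTS = {
    b"CHARSET=latin1": b"CHARSET=utf8",
    b"SET NAMES latin1": b"SET NAMES utf8",
}


def relabel_dump(data: bytes) -> bytes:
    """Swap the latin1 labels for utf8 without touching any other bytes."""
    for old, new in REPLACEMENTS.items():
        data = data.replace(old, new)
    return data


def relabel_dump_file(src: str, dst: str) -> None:
    """Read the whole dump into memory, relabel it, and write the result."""
    Path(dst).write_bytes(relabel_dump(Path(src).read_bytes()))


# Example call (assumes m.sql exists in the current directory):
# relabel_dump_file("m.sql", "m2.sql")
```

    Like the `replace` command, this also rewrites the two phrases if they happen to occur inside row data, and it loads the entire file at once, so a very large dump would need a streaming variant.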
    

    In my case this link was of great help. Commented here in Spanish.
