What will happen to existing data if I change the collation of a column in MySQL?

无人及你 2021-01-12 02:37

I am running a production application with a MySQL database server. I forgot to change a column's collation from latin to utf8_unicode, which results in

5 answers
  •  北恋 (original poster)
    2021-01-12 03:24

    Valid data will be properly converted:

    When you change a data type using CHANGE or MODIFY, MySQL tries to convert existing column values to the new type as well as possible. Warning: This conversion may result in alteration of data.

    http://dev.mysql.com/doc/refman/5.5/en/alter-table.html

    ... and more specifically:

    To convert a binary or nonbinary string column to use a particular character set, use ALTER TABLE. For successful conversion to occur, one of the following conditions must apply:[...] If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.

    http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html
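
    The two situations described in the documentation can be sketched roughly as follows. This is only an illustration under assumptions: the table name t, the column name col and the VARCHAR(255) type are hypothetical, and the statements should be tried on a copy of the table first.

    ```sql
    -- Hypothetical names: table t, column col, declared VARCHAR(255) latin1.
    -- Note: MODIFY needs the full column definition, so restate NOT NULL,
    -- DEFAULT and similar attributes if the real column has them.

    -- Case 1: the stored bytes really are latin1.
    -- A direct change lets MySQL transcode the existing values.
    ALTER TABLE t MODIFY col VARCHAR(255)
        CHARACTER SET utf8 COLLATE utf8_unicode_ci;

    -- Case 2: the column is declared latin1 but actually holds UTF-8 bytes.
    -- Step 1: switch to a binary type so the bytes are no longer interpreted.
    ALTER TABLE t MODIFY col VARBINARY(255);
    -- Step 2: relabel the untouched bytes with the character set they are in.
    ALTER TABLE t MODIFY col VARCHAR(255)
        CHARACTER SET utf8 COLLATE utf8_unicode_ci;
    ```

    Running Case 1 on data that belongs to Case 2 (or the reverse) is what typically produces garbled text, so it is worth checking the raw bytes before choosing.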

    So your problem is invalid data, e.g., data encoded in a character set other than the one the column declares. I tried the tip suggested by the documentation and it basically ruined my data, but only because my data was already lost: running SELECT column, HEX(column) FROM table showed that multibyte characters had been stored as 0x3F (i.e., the ? symbol in Latin1). My MySQL stack had been smart enough to detect that the input data was not Latin1 and convert it into something "compatible". And once data is gone, you can't get it back.

    To sum up:

    1. Use HEX() to find out if you still have your data.
    2. Run your tests on a copy of your table.
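
    A minimal sketch of these two steps, assuming a hypothetical table t with a latin1 column col (both names are placeholders, as is t_copy):

    ```sql
    -- Work on a copy, not on the production table.
    CREATE TABLE t_copy LIKE t;
    INSERT INTO t_copy SELECT * FROM t;

    -- Rows that contain a literal question mark: if HEX() shows 3F where an
    -- accented or multibyte character was expected, that character is lost.
    SELECT col, HEX(col)
    FROM t_copy
    WHERE col LIKE '%?%'
    LIMIT 20;

    -- Rows that contain at least one byte >= 0x80, i.e. non-ASCII data that
    -- may still be recoverable. HEX() returns two uppercase hex digits per
    -- byte, so the pattern looks for a byte whose first digit is 8 to F.
    SELECT col, HEX(col)
    FROM t_copy
    WHERE HEX(col) REGEXP '^(..)*[89A-F]'
    LIMIT 20;
    ```

    As a reading aid: in a latin1 column, an é stored correctly as Latin1 shows up as E9, the same character stored as raw UTF-8 bytes shows up as C3A9, and a character that was already replaced on insert shows up as 3F.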
