I imported some data using LOAD DATA INFILE into a MySQL Database. The table itself and the columns are using the UTF8 character set, but the default character set of the d
Though this is hardly still relevant for the OP, I found a solution in the MySQL documentation for ALTER TABLE. I am posting it here for future reference:
Warning
The CONVERT TO operation converts column values between the character sets. This is not what you want if you have a column in one character set (like latin1) but the stored values actually use some other, incompatible character set (like utf8). In this case, you have to do the following for each such column:
ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 TEXT CHARACTER SET utf8;
The reason this works is that there is no conversion when you convert to or from BLOB columns.
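To see why the BLOB detour matters, here is a minimal Python sketch of the two behaviours at the byte level. It is only an illustration, not what MySQL runs internally: the function names `convert_to_utf8` and `relabel_as_utf8` are made up for this example, and Python's `latin-1` codec stands in for MySQL's latin1 (which is closer to cp1252, though the two agree on the bytes used here).

```python
# Text as the application intended it ("cafe" with an acute accent on the e).
original = "caf\u00e9"

# The client sent UTF-8 bytes, but the column was declared latin1,
# so the same bytes sit in the table under the wrong label.
stored_bytes = original.encode("utf-8")


def convert_to_utf8(raw: bytes) -> bytes:
    """Mimic CONVERT TO: decode using the declared charset, re-encode as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


def relabel_as_utf8(raw: bytes) -> bytes:
    """Mimic the BLOB round trip: the bytes are untouched, only the label changes."""
    return raw


converted = convert_to_utf8(stored_bytes)
relabeled = relabel_as_utf8(stored_bytes)

print(stored_bytes)  # the bytes as stored
print(converted)     # every non-ASCII byte has been encoded a second time
print(relabeled)     # identical to the stored bytes

# Only the relabelled bytes decode back to the intended text.
print(relabeled.decode("utf-8") == original)
print(converted.decode("utf-8") == original)
```

The converted value is longer than the stored one because each byte of the accented character was treated as a separate latin1 character and encoded again, which is the classic double-encoding symptom.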
Try this:
1) Dump your DB
mysqldump --default-character-set=latin1 -u username -p databasename > dump.sql
2) Open dump.sql in a text editor and replace all occurrences of "SET NAMES latin1" with "SET NAMES utf8"
3) Create a new database and restore your dumpfile
cat dump.sql | mysql -u root -p newdbname
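If the dump is too big for a text editor, step 2 can be scripted. The following is a rough Python sketch, not part of the original answer: the helper names `retarget_set_names` and `rewrite_dump_file` and the output filename are hypothetical. It works on raw bytes so the data in the dump is never decoded or re-encoded along the way.

```python
from pathlib import Path


def retarget_set_names(dump: bytes) -> bytes:
    """Replace every 'SET NAMES latin1' with 'SET NAMES utf8', leaving all other bytes alone."""
    return dump.replace(b"SET NAMES latin1", b"SET NAMES utf8")


def rewrite_dump_file(src: str, dst: str) -> None:
    """Read the whole dump into memory, patch it, and write the result to a new file."""
    Path(dst).write_bytes(retarget_set_names(Path(src).read_bytes()))


# Small in-memory demonstration; the row data contains UTF-8 bytes that must survive.
sample = b"/*!40101 SET NAMES latin1 */;\nINSERT INTO t1 VALUES ('caf\xc3\xa9');\n"
print(retarget_set_names(sample))
```

For a real dump you would call something like `rewrite_dump_file("dump.sql", "dump_utf8.sql")` and restore the new file in step 3. Note that this reads the whole file at once, so a very large dump would need a streaming approach instead.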
I wrote http://code.google.com/p/mysqlutf8convertor/ to convert a Latin database to a UTF-8 database. It changes all tables and fields to UTF-8.
Converting latin1 to UTF8 is not what you want to do; you more or less need the opposite.
If what really happened was this:
What you must do now is:
I recently completed a shell script that automates the conversion process. It can also be configured with custom filters for any text you wish to replace or remove, for example stripping HTML characters. Table whitelists and blacklists are also possible. You can download it from SourceForge: https://sourceforge.net/projects/mysqltr/
I've had cases like this in old WordPress installations, where the problem was that the data itself was already UTF-8 inside a latin1 database (because of the WP default charset). This means the data needed no real conversion; only the database and table definitions did. In my experience things get messed up during the dump because, as I understand it, MySQL uses the client's default character set, which in many cases is now UTF-8. It is therefore very important to export with the character set the tables declare, so that the stored bytes are written out unchanged. For a latin1 database holding UTF-8 encoded data:
$ mysqldump --default-character-set=latin1 --databases wordpress > m.sql
Then replace the latin1 references in the exported dump before reimporting it into a new UTF-8 database. Something like:
$ replace "CHARSET=latin1" "CHARSET=utf8" \
"SET NAMES latin1" "SET NAMES utf8" < m.sql > m2.sql
In my case this link was of great help. Commented here in Spanish.