Detecting utf8 broken characters in MySQL

广开言路 2020-12-02 05:03

I've got a database with a bunch of broken utf8 characters scattered across several tables. As far as I know, the list of affected characters isn't very extensive (áéíúóÁÉÍÓÚÑñ).

Fixing

18 answers
  •  难免孤独
    2020-12-02 05:36

    Based on the data in this post, https://www.i18nqa.com/debug/utf8-debug.html, I'd suggest the following query as a good way of identifying dodgy entries together with their likely correct values:

    SELECT my_field,
           CONVERT(BINARY CONVERT(my_field USING latin1) USING utf8mb4) AS new_field_value
    FROM my_table
    WHERE my_field REGEXP '[âÆËÅÂÃ]';

    Be very careful: we had a file name that was badly encoded next to a path that was encoded correctly, and in that case some of the solutions above would have caused a world of pain. If some of your data is already correctly encoded in UTF-8, a blanket conversion will likely lose a chunk of it.
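
To make the warning concrete, here is a minimal Python sketch of the same round trip the query performs (text back to latin1 bytes, bytes reinterpreted as UTF-8), with a guard that leaves a value alone when the round trip fails. The helper names `to_mysql_latin1_bytes` and `repair_mojibake` are hypothetical, not part of MySQL or any library. It assumes MySQL's latin1 behaves like cp1252 with the five bytes that cp1252 leaves undefined passed through unchanged; treat it as an illustration of the mechanism, not a replacement for testing against your own data.

```python
# Bytes that cp1252 leaves undefined; assumed here to pass through MySQL's
# latin1 as the code point with the same number.
_PASS_THROUGH = {0x81, 0x8D, 0x8F, 0x90, 0x9D}


def to_mysql_latin1_bytes(text: str) -> bytes:
    """Encode text roughly the way CONVERT(... USING latin1) would (assumption)."""
    out = bytearray()
    for ch in text:
        if ord(ch) in _PASS_THROUGH:
            out.append(ord(ch))
        else:
            # Raises UnicodeEncodeError for characters outside cp1252.
            out += ch.encode("cp1252")
    return bytes(out)


def repair_mojibake(text: str) -> str:
    """Return the repaired text, or the input unchanged if it does not round-trip."""
    try:
        return to_mysql_latin1_bytes(text).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Already-correct text (or text damaged some other way) ends up here.
        return text


if __name__ == "__main__":
    print(repair_mojibake("EspaÃ±a"))  # broken value
    print(repair_mojibake("España"))   # already-correct value stays as it is

    # A correctly encoded directory followed by a broken file name:
    path = "/años/informe_aÃ±o.txt"
    print("/".join(repair_mojibake(part) for part in path.split("/")))
```

The guard is the important part: a correctly encoded "ñ" becomes the single byte 0xF1 in latin1, which is not valid UTF-8 on its own, so the conversion fails instead of producing a repaired value. A value that mixes correct and broken parts fails as a whole for the same reason, which is why the sketch repairs each path component separately.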
