What MySQL collation is best for accepting all unicode characters?

栀梦 2020-12-31 01:59

Our column is currently collated to latin1_swedish_ci and special unicode characters are, obviously, getting stripped out. We want to be able to accept chars su

1 answer
  • 2020-12-31 02:25

    The collation is the least of your worries. What you need to think about first is the character set of the column/table/database, because it determines which characters can be stored at all. The collation (the rules governing how data is compared and sorted) is just a corollary of that.

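    MySQL supports several Unicode character sets, of which utf8 and utf8mb4 are the most interesting. utf8 (an alias for utf8mb3) stores at most three bytes per character and so supports only the characters in the Basic Multilingual Plane (BMP), i.e. a subset of Unicode. utf8mb4, available since MySQL 5.5.3, stores up to four bytes per character and supports all of Unicode.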
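
    To make the BMP distinction concrete, here is a minimal Python sketch that checks whether a string contains characters outside the BMP, i.e. characters that take four bytes in UTF-8 and so need utf8mb4 rather than utf8. The function names needs_utf8mb4 and utf8_byte_lengths are made up for this illustration and are not part of MySQL or any library.

```python
def needs_utf8mb4(text):
    """Return True if any character lies outside the BMP (code point > U+FFFF).

    Such characters take four bytes in UTF-8, which MySQL's three-byte
    utf8 character set cannot store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


def utf8_byte_lengths(text):
    """Return the number of UTF-8 bytes each character of text occupies."""
    return [len(ch.encode("utf-8")) for ch in text]


if __name__ == "__main__":
    for sample in ["naïve ß 日本語", "ok 😀"]:
        print(repr(sample), utf8_byte_lengths(sample), needs_utf8mb4(sample))
```

    Accented Latin letters and CJK text stay within three bytes per character, while most emoji need four, which is the usual reason to pick utf8mb4.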

    The collation to use with any of the Unicode character sets is most likely xxx_general_ci or xxx_unicode_ci, where xxx is the character set name (for example utf8mb4_unicode_ci). The former is a simplified, language-independent sorting and comparison algorithm. The latter is a more complete language-independent algorithm that supports more Unicode features (e.g. treating "ß" and "ss" as equivalent), and is therefore also slower.
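
    As a rough sketch of how this could be applied to an existing latin1 table, the helper below builds an ALTER TABLE ... CONVERT TO CHARACTER SET statement. The helper name build_convert_statement and the table name comments are hypothetical, and the choice of utf8mb4_unicode_ci as the default is an assumption, not a recommendation from the MySQL manual. Back up the table and try the statement on a copy first, since the conversion rewrites every character column.

```python
import re

# Only plain identifiers are accepted, so nothing unexpected ends up in the SQL.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_convert_statement(table, charset="utf8mb4",
                            collation="utf8mb4_unicode_ci"):
    """Build an ALTER TABLE statement that converts a table's character set.

    The table name and the defaults are illustrative assumptions.
    """
    for name in (table, charset, collation):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("not a plain identifier: %r" % (name,))
    # A collation belongs to exactly one character set; its name is prefixed
    # with the character set name (e.g. utf8mb4_unicode_ci).
    if not collation.startswith(charset + "_"):
        raise ValueError("collation %r does not belong to %r"
                         % (collation, charset))
    return ("ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s;"
            % (table, charset, collation))


if __name__ == "__main__":
    print(build_convert_statement("comments"))
```

    Passing collation="utf8mb4_general_ci" instead trades the more complete comparison rules for speed, as described above.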

    See https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html.
