What MySQL collation is best for accepting all unicode characters?

前端未结


Our column is currently collated to latin1_swedish_ci and special unicode characters are, obviously, getting stripped out. We want to be able to accept chars su


1 answer

忘了有多久

2020-12-31 02:25

The collation is the least of your worries; what you need to think about first is the character set for the column/table/database. The collation (the rules governing how data is compared and sorted) is just a corollary of that choice.

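MySQL supports several Unicode character sets, of which utf8 and utf8mb4 are the most relevant here. utf8 (an alias for utf8mb3, at most three bytes per character) covers only the characters in the Basic Multilingual Plane (BMP), i.e. a subset of Unicode. utf8mb4, available since MySQL 5.5.3, uses up to four bytes per character and covers all of Unicode, including supplementary characters such as emoji.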
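To make the BMP distinction concrete, here is a minimal Python sketch (not MySQL code). The helper name `needs_utf8mb4` and the sample strings are made up for illustration; the check simply looks for code points above U+FFFF, which are the ones that take four bytes in UTF-8 and so do not fit in a utf8 column.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the BMP (code point > U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Illustrative samples: Latin text with accents, CJK text, and an emoji.
samples = ["Gr\u00fc\u00dfe", "\u65e5\u672c\u8a9e", "\U0001F600"]

for s in samples:
    # Number of UTF-8 bytes each character occupies.
    widths = [len(ch.encode("utf-8")) for ch in s]
    print(ascii(s), widths, needs_utf8mb4(s))
```

Only the last sample contains a four-byte character, so only that one would require utf8mb4 rather than utf8.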

The collation to use with any of the Unicode character sets is most likely xxx_general_ci or xxx_unicode_ci. The former is a simplified, language-independent sorting and comparison algorithm. The latter is a more complete language-independent algorithm, based on the Unicode Collation Algorithm, that supports more Unicode features (e.g. treating "ß" and "ss" as equivalent) and is therefore somewhat slower.
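As a rough sketch of how the character set and collation choice translate into a schema change, the following Python helper assembles a MySQL `ALTER TABLE ... CONVERT TO CHARACTER SET ... COLLATE ...` statement. The function name `conversion_statement` and the table name `comments` are hypothetical, and the sketch assumes the collation name starts with its character set (as in `utf8mb4_unicode_ci`). It only builds the statement text; running it against a real table is left to the reader and should be tried on a copy of the data first.

```python
def conversion_statement(table: str, collation: str = "utf8mb4_unicode_ci") -> str:
    """Build a statement that converts a table to the collation's character set."""
    # Collation names are prefixed with their character set, e.g. "utf8mb4_...".
    charset = collation.split("_", 1)[0]
    # Accept only simple identifiers so nothing unexpected ends up in the SQL text.
    for name in (table, collation):
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected identifier: {name!r}")
    return (
        f"ALTER TABLE `{table}` "
        f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
    )


# "comments" is a hypothetical table name used only for illustration.
print(conversion_statement("comments"))
print(conversion_statement("comments", "utf8mb4_general_ci"))
```

Passing `utf8mb4_general_ci` instead of the default trades some comparison accuracy for speed, as described above.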

See https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html.
