I don't understand Collation? (Mysql, RDBMS, Character sets)

大城市里の小女人 submitted on 2019-11-27 05:20:34

Question


I understand character sets, but I don't understand collation. I know every character set in MySQL (or any RDBMS) comes with a default collation, but I still don't get what a collation is. Can someone please explain in layman's terms?

Thank you in advance ;-)


Answer 1:


The main point of a database collation is determining how data is sorted and compared.

Case sensitivity of string comparisons

SELECT "New York" = "NEW YORK";

will return true for a case-insensitive collation and false for a case-sensitive one.

You can tell which is which from the suffix of the collation's name: _ci means case insensitive, _cs means case sensitive. _bin collations do binary comparisons (strings must be 100% identical).
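To make the difference concrete outside the database, here is a minimal Python sketch of the two comparison styles. It is only an illustration of the idea, not MySQL's implementation; the function names equal_ci and equal_bin are made up for this example, and real _ci collations may ignore more than letter case.

```python
def equal_ci(a: str, b: str) -> bool:
    """Compare roughly like a case-insensitive (_ci) collation: ignore letter case."""
    return a.casefold() == b.casefold()


def equal_bin(a: str, b: str, encoding: str = "utf-8") -> bool:
    """Compare roughly like a binary (_bin) collation: the encoded bytes must match."""
    return a.encode(encoding) == b.encode(encoding)


if __name__ == "__main__":
    print(equal_ci("New York", "NEW YORK"))   # True
    print(equal_bin("New York", "NEW YORK"))  # False
```

The data is the same in both calls; only the comparison rule changes, which is exactly the part a collation controls.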

Comparison of umlauts/accented characters

The collation also determines whether accented characters are treated as their Latin base counterparts in string comparisons.

SELECT "Düsseldorf" =  "Dusseldorf";
SELECT "Èclair" =      "Eclair";

Each comparison will return true under a collation that treats accented characters as their base letters, and false under one that does not. You will need to read each collation's description to find out which is which.
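As a rough illustration of what "treated as their base counterparts" means, the following Python sketch strips accent marks before comparing. This is an assumption about how to model the behaviour, not what MySQL does internally, and the helper names are hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def equal_accent_insensitive(a: str, b: str) -> bool:
    """Compare two strings while ignoring both accents and letter case."""
    return strip_accents(a).casefold() == strip_accents(b).casefold()


if __name__ == "__main__":
    print(equal_accent_insensitive("Düsseldorf", "Dusseldorf"))  # True
    print(equal_accent_insensitive("Èclair", "Eclair"))          # True
    print("Düsseldorf" == "Dusseldorf")                          # False (plain comparison)
```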

String sorting

The collation influences the way strings are sorted.

For example,

  • Umlauts Ä Ö Ü are at the end of the alphabet in the Finnish/Swedish alphabet (latin1_swedish_ci)

  • they are treated as A O U in German DIN-1 sorting (latin1_german1_ci)

  • and as AE OE UE in German DIN-2 sorting (latin1_german2_ci). ("phone book" sorting)

  • In latin1_spanish_ci, "ñ" (n-tilde) is a separate letter between "n" and "o".

These rules will result in different sort orders when the data contains characters outside the basic Latin alphabet, such as umlauts or accented letters.
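The effect of those rules on a sort can be sketched in Python with three different sort keys. This is a simplified model under stated assumptions: the key functions are hypothetical, they only cover the Ä Ö Ü (and Å) cases from the list above, and the real collations handle many more characters.

```python
import unicodedata

WORDS = ["Ärger", "Zebra", "Apfel", "Aerobic"]

# Simplified Swedish alphabet: the extra letters come after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def _fold(word: str) -> str:
    """Lowercase the word and make sure umlauts are single (precomposed) characters."""
    return unicodedata.normalize("NFC", word).casefold()


def din1_key(word: str) -> str:
    """DIN-1 style: umlauts sort like their base letters (ä -> a)."""
    return _fold(word).translate(str.maketrans({"ä": "a", "ö": "o", "ü": "u"}))


def din2_key(word: str) -> str:
    """DIN-2 'phone book' style: umlauts sort like ae, oe, ue."""
    return _fold(word).translate(str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"}))


def swedish_key(word: str) -> list:
    """Swedish style: position in the alphabet, so ä and ö come after z.
    Characters not in SWEDISH_ALPHABET get -1 and sort first (a simplification)."""
    return [SWEDISH_ALPHABET.find(ch) for ch in _fold(word)]


if __name__ == "__main__":
    print(sorted(WORDS, key=din1_key))     # ['Aerobic', 'Apfel', 'Ärger', 'Zebra']
    print(sorted(WORDS, key=din2_key))     # ['Ärger', 'Aerobic', 'Apfel', 'Zebra']
    print(sorted(WORDS, key=swedish_key))  # ['Aerobic', 'Apfel', 'Zebra', 'Ärger']
```

The same four words come out in three different orders, depending only on which rule set is applied.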

Using collations at runtime

You have to choose a collation for your table and columns, but if you don't mind the performance hit, you can force database operations into a certain collation at runtime using the COLLATE keyword.

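This will sort the rows of the table by the name column using German DIN-2 sorting rules:

-- "table" stands for your table name; TABLE is a reserved word, so it needs backticks.
-- Assumes the name column uses the latin1 character set.
SELECT name
FROM `table`
ORDER BY name COLLATE latin1_german2_ci;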

Using COLLATE at runtime will have performance implications, as each column value has to be converted during the query. So think twice before applying this to large data sets.
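If you want to try the idea of a per-query COLLATE without a MySQL server, the sketch below uses Python's built-in sqlite3 module. Note the assumptions: this is SQLite, not MySQL, so the collation names differ (SQLite ships BINARY and NOCASE rather than the latin1_* collations), and the people table is made up for the example. Only the principle carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column gets SQLite's default collation, BINARY (byte-wise comparison).
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("bob",), ("Alice",), ("alice",), ("Bob",)],
)

# Sort with the column's own collation: uppercase letters come before lowercase.
default_order = [row[0] for row in conn.execute(
    "SELECT name FROM people ORDER BY name")]

# Override the collation for this one query only.
nocase_order = [row[0] for row in conn.execute(
    "SELECT name FROM people ORDER BY name COLLATE NOCASE")]

if __name__ == "__main__":
    print(default_order)  # ['Alice', 'Bob', 'alice', 'bob']
    print(nocase_order)   # both spellings of alice first, then both of bob
```

The table definition never changes; the second query simply asks for a different comparison rule while it runs.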

MySQL Reference:

  • Character Sets and Collations That MySQL Supports
  • Examples of the Effect of Collation
  • Collation issues



Answer 2:


Collation is information about how strings should be sorted and compared.

It covers, for example, case sensitivity (whether a = A), special character handling (whether a = á), and character order (whether O < Ö).



Source: https://stackoverflow.com/questions/3324900/i-dont-understand-collation-mysql-rdbms-character-sets
