Between utf8_general_ci and utf8_unicode_ci, are there any differences in terms of performance?
There are two big differences: sorting and character matching.
Sorting:
utf8mb4_general_ci removes all accents and sorts characters one by one, which may produce incorrect sort results. utf8mb4_unicode_ci sorts accurately.
Character Matching:
They match characters differently.
For example, in utf8mb4_unicode_ci you have i != ı, but in utf8mb4_general_ci ı = i holds.
Imagine you have a row with name="Yılmaz". Then
select id from users where name='Yilmaz';
would return the row if the collation is utf8mb4_general_ci, but would not return it if the collation is utf8mb4_unicode_ci!
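If you want to check this on your own server without creating a table, a minimal sketch is to compare the two literals directly and force the collation with a COLLATE clause. This assumes a MySQL server that supports utf8mb4 and a client that really sends UTF-8; the column aliases are just illustrative names.

```sql
-- Make sure string literals are interpreted as utf8mb4 on this connection.
SET NAMES utf8mb4;

-- Compare the same two strings under each collation.
-- Each comparison yields 1 (equal) or 0 (not equal).
SELECT
    'Yılmaz' COLLATE utf8mb4_general_ci = 'Yilmaz' AS general_ci_match,
    'Yılmaz' COLLATE utf8mb4_unicode_ci = 'Yilmaz' AS unicode_ci_match;
```

If the behaviour described above holds on your version, general_ci_match should come back as 1 and unicode_ci_match as 0.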
On the other hand, a=ª and ß=ss in utf8mb4_unicode_ci, which is not the case in utf8mb4_general_ci. So imagine you have a row with name="ªßi"; then
select id from users where name='assi';
would return the row if the collation is utf8mb4_unicode_ci, but would not return a row if the collation is utf8mb4_general_ci.
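To try this against an actual table, something like the following sketch should work. The table name users_demo and its layout are made up for illustration (they are not from any real schema), and it again assumes the connection uses utf8mb4.

```sql
SET NAMES utf8mb4;

-- Hypothetical demo table whose name column defaults to utf8mb4_unicode_ci.
CREATE TABLE users_demo (
    id   INT PRIMARY KEY,
    name VARCHAR(100)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

INSERT INTO users_demo (id, name) VALUES (1, 'ªßi');

-- Uses the column's own collation (utf8mb4_unicode_ci).
SELECT id FROM users_demo WHERE name = 'assi';

-- Overrides the collation for this one comparison only.
SELECT id FROM users_demo WHERE name COLLATE utf8mb4_general_ci = 'assi';

DROP TABLE users_demo;
```

Going by the matching rules described above, the first SELECT should find the row and the second should not. The per-query COLLATE override is handy for experimenting, but note that applying it to the column generally prevents an index on that column from being used for the comparison.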
A full list of matches for each collation may be found here.