For a poor man's implementation of near-collation-correct sorting on the client side I need a JavaScript function that does efficient single character rep
https://stackoverflow.com/a/37511463
With ES2015/ES6 String.prototype.normalize(),
    const str = "Crème Brulée"
    str.normalize('NFD').replace(/[\u0300-\u036f]/g, "")
    // → 'Creme Brulee'
Two things are happening here:
- normalize()ing to NFD Unicode normal form decomposes combined graphemes into the combination of simple ones. The è of Crème ends up expressed as e + ̀ (U+0300).
- Using a regex character class to match the U+0300 → U+036F range, it is now trivial to globally get rid of the diacritics, which the Unicode standard conveniently groups as the Combining Diacritical Marks Unicode block.

See comment for performance testing.
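The two steps above can be wrapped in a small reusable function. This is only a sketch: the name stripDiacritics is made up for illustration and is not part of any library. Note that the approach only removes marks that NFD actually splits off, so letters with no canonical decomposition (such as Ł) pass through unchanged.

```javascript
// Hypothetical helper (name chosen for this example only).
// Step 1: NFD splits precomposed letters into base letter + combining mark.
// Step 2: the regex deletes everything in the Combining Diacritical Marks
//         block (U+0300 to U+036F).
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

console.log(stripDiacritics("Crème Brulée")); // "Creme Brulee"
console.log(stripDiacritics("Ångström"));     // "Angstrom"
console.log(stripDiacritics("Łódź"));         // "Łodz" (Ł has no decomposition)
```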
Alternatively, if you just want sorting
Intl.Collator has sufficient support (~85% right now). A polyfill is also available here, but I haven't tested it.
    const c = new Intl.Collator();
    ['creme brulee', 'crème brulée', 'crame brulai', 'crome brouillé', 'creme brulay', 'creme brulfé', 'creme bruléa'].sort(c.compare)
    // → [ 'crame brulai', 'creme brulay', 'creme bruléa', 'creme brulee', 'crème brulée', 'creme brulfé', 'crome brouillé' ]

    // Naive comparison by UTF-16 code unit, for contrast
    // (a sort comparator must return a negative, zero, or positive number, not a boolean):
    ['creme brulee', 'crème brulée', 'crame brulai', 'crome brouillé'].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0))
    // → ["crame brulai", "creme brulee", "crome brouillé", "crème brulée"]
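If accents and case should be ignored entirely when comparing, Intl.Collator accepts a sensitivity option. The sketch below assumes an "en" locale and uses two made-up helper names (stripDiacritics, makeComparator). It falls back to comparing accent-stripped, lower-cased keys when Intl.Collator is missing. The fallback is only an approximation of real collation, not a substitute for it.

```javascript
// Hypothetical helper: remove combining marks after NFD decomposition.
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical helper: build a comparator for Array.prototype.sort.
function makeComparator(locale) {
  if (typeof Intl !== "undefined" && typeof Intl.Collator === "function") {
    // sensitivity: "base" treats e/é/E as equal and only separates base letters.
    return new Intl.Collator(locale, { sensitivity: "base" }).compare;
  }
  // Fallback (assumption: plain code-unit order on stripped keys is good enough).
  return (a, b) => {
    const x = stripDiacritics(a).toLowerCase();
    const y = stripDiacritics(b).toLowerCase();
    return x < y ? -1 : x > y ? 1 : 0;
  };
}

const compareBase = makeComparator("en");

console.log(compareBase("creme", "crème")); // 0
console.log(["crome", "crème", "crame"].sort(compareBase));
// [ 'crame', 'crème', 'crome' ]
```

For long lists the fallback recomputes the stripped key on every comparison; precomputing the keys once before sorting would avoid that.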