The ICU project (which now also has a PHP library) contains the classes needed to normalize UTF-8 strings, which makes it easier to compare values when searching.
If two Unicode strings are canonically equivalent, they are really the same string, only encoded with different code point sequences. For example, Ä can be represented either as the single precomposed character Ä (U+00C4) or as A followed by the combining diaeresis ◌̈ (U+0308).
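As a rough illustration of canonical equivalence, here is a minimal sketch using Python's standard `unicodedata` module rather than ICU itself. The helper name `canonically_equal` is made up for this example. It normalizes both strings to NFC before comparing them.

```python
import unicodedata

# The same visible character, built two different ways.
precomposed = "\u00C4"   # Ä as a single code point
decomposed = "A\u0308"   # A followed by COMBINING DIAERESIS

def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(precomposed == decomposed)                   # raw code point comparison
print(canonically_equal(precomposed, decomposed))  # comparison after normalization
```

The raw comparison sees two different sequences, while the normalized comparison treats them as the same string. NFD would work just as well here, as long as both sides use the same form.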
If the strings are only compatibility equivalent, they aren't necessarily the same, but they may be treated as the same in some contexts. For example, the ligature ﬀ (U+FB00) could be considered the same as the two letters ff.
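To make the difference concrete, the sketch below (again Python's `unicodedata`, used as a stand-in for ICU's normalizer) applies a canonical form (NFC) and a compatibility form (NFKC) to the ligature. Only the compatibility form folds it into two plain letters.

```python
import unicodedata

ligature = "\uFB00"  # LATIN SMALL LIGATURE FF
plain = "ff"         # two ordinary letters

# Canonical normalization leaves the ligature alone.
nfc = unicodedata.normalize("NFC", ligature)

# Compatibility normalization replaces it with its compatibility mapping.
nfkc = unicodedata.normalize("NFKC", ligature)

print(nfc == plain)   # canonical form: still the ligature
print(nfkc == plain)  # compatibility form: now plain "ff"
```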
So, if you are comparing strings for equality, you should use canonical equivalence, because compatibility equivalence isn't real equivalence.
But if you want to sort a set of strings, it might make sense to use compatibility equivalence, as the strings are nearly identical.
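One way this could look in practice is to sort on a compatibility-normalized key, sketched here in Python. The `compat_key` function and the word list are invented for illustration. This is a plain code point sort, not locale-aware collation, which a real application would more likely get from a collator such as ICU's.

```python
import unicodedata

def compat_key(s: str) -> str:
    """Hypothetical sort key: the NFKC-normalized form of the string."""
    return unicodedata.normalize("NFKC", s)

# "offer" is spelled here with the ff ligature (U+FB00).
words = ["office", "o\uFB00er", "offal"]

raw_order = sorted(words)                     # ligature sorts after every plain "f"
compat_order = sorted(words, key=compat_key)  # ligature sorts as if it were "ff"

print(raw_order)
print(compat_order)
```

With the raw sort, the ligature spelling ends up last because U+FB00 has a higher code point than any ASCII letter. With the normalized key, it lands between "offal" and "office", where a reader would expect it.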