I have a regular expression to get the initials of a name like below:
/\\b\\p{L}\\./gu
it works fine with English and other languages until there ar
You need to match diacritic marks after base letters using \p{M}*:
'~\b(?<!\p{M})\p{L}\p{M}*\.~u'
The pattern matches:
\b - a word boundary
(?<!\p{M}) - the char before the current position must not be a diacritic char (without it, a match can occur within a single word)
\p{L} - any base Unicode letter
\p{M}* - 0+ diacritic marks
\. - a dot.
See the PHP demo online:
$s = "क. ಕ. के. ಕೆ. ";
echo preg_replace('~\b(?<!\p{M})\p{L}\p{M}*\.~u', '$0
', $s);
// => क.
ಕ.
के.
ಕೆ.
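If the goal is to collect the initials rather than reformat the string, the same pattern can be passed to preg_match_all. The snippet below is only a minimal sketch: the helper name extract_initials and the sample string are made up for illustration, and it assumes the source file is saved as UTF-8 and that PCRE is built with Unicode property support (the default in stock PHP builds).

```php
<?php
// Hypothetical helper (not from the original post): returns every "initial",
// i.e. one base letter, any combining marks attached to it, and a trailing dot.
function extract_initials(string $text): array
{
    // \b          word boundary
    // (?<!\p{M})  previous char must not be a combining mark
    // \p{L}       one base letter
    // \p{M}*      zero or more combining marks
    // \.          a literal dot
    $pattern = '~\b(?<!\p{M})\p{L}\p{M}*\.~u';

    preg_match_all($pattern, $text, $matches);

    // $matches[0] holds the full-pattern matches in order of appearance.
    return $matches[0];
}

// "काका." is a whole word followed by a dot, so it should not count as an initial.
$initials = extract_initials("के. ಕೆ. काका.");
echo implode("\n", $initials), "\n";
```

With this sample string only के. and ಕೆ. should be returned: the second क in काका. follows a combining mark, so the lookbehind rejects it.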