I am trying to convert text that users paste from MS Word, which contains Word's ellipsis and long dash characters, before processing it further.
I found an old proposed solution here to t
Hmm. I use this function for sanitizing text copied into an RTE. It may or may not work in this case. It converts to HTML entities, but you could tweak it to just convert to regular characters:
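function convertFromCP1252($string)
{
    // str_replace() applies the pairs in order, so '&' has to stay first;
    // otherwise the ampersands in the generated entities get escaped again.
    // Note: the Mac Roman byte values double as accented letters in CP1252
    // (e.g. chr(201) is É, chr(212) is Ô), so those letters get clobbered too.
    $search = array('&',
                    '<',
                    '>',
                    '"',
                    chr(212), // Mac Roman left single quote
                    chr(213), // Mac Roman right single quote
                    chr(210), // Mac Roman left double quote
                    chr(211), // Mac Roman right double quote
                    chr(208), // Mac Roman en dash
                    chr(209), // Mac Roman em dash
                    chr(201), // Mac Roman ellipsis
                    chr(145), // CP1252 left single quote
                    chr(146), // CP1252 right single quote
                    chr(147), // CP1252 left double quote
                    chr(148), // CP1252 right double quote
                    chr(150), // CP1252 en dash
                    chr(151), // CP1252 em dash
                    chr(133), // CP1252 ellipsis
                    chr(194)  // stray 0xC2 byte; removing it breaks genuine UTF-8 text
                    );
    $replace = array('&amp;',
                     '&lt;',
                     '&gt;',
                     '&quot;',
                     '&#8216;', // left single quote
                     '&#8217;', // right single quote
                     '&#8220;', // left double quote
                     '&#8221;', // right double quote
                     '&#8211;', // en dash
                     '&#8212;', // em dash
                     '&#8230;', // ellipsis
                     '&#8216;',
                     '&#8217;',
                     '&#8220;',
                     '&#8221;',
                     '&#8211;',
                     '&#8212;',
                     '&#8230;',
                     ''
                     );
    return str_replace($search, $replace, $string);
}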
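If you want regular characters rather than entities, here is a rough sketch of that tweak. The function name `wordPunctuationToAscii` and the ASCII stand-ins (`--` for the long dash, `...` for the ellipsis) are my own choices, not part of the function above. The encoding check is only a heuristic: it assumes that anything which validates as UTF-8 really is UTF-8, and treats everything else as single-byte CP1252.

```php
<?php
// Hypothetical helper (not from the original answer): maps Word's "smart"
// punctuation to plain ASCII instead of HTML entities.
function wordPunctuationToAscii($string)
{
    // Heuristic, an assumption: preg_match() with the /u flag returns 1 only
    // when the subject is valid UTF-8 and false when it is not.
    $isUtf8 = preg_match('//u', $string) === 1;

    if ($isUtf8) {
        $map = array(
            "\xE2\x80\x98" => "'",   // U+2018 left single quote
            "\xE2\x80\x99" => "'",   // U+2019 right single quote
            "\xE2\x80\x9C" => '"',   // U+201C left double quote
            "\xE2\x80\x9D" => '"',   // U+201D right double quote
            "\xE2\x80\x93" => '-',   // U+2013 en dash
            "\xE2\x80\x94" => '--',  // U+2014 em dash
            "\xE2\x80\xA6" => '...', // U+2026 ellipsis
        );
    } else {
        $map = array(
            "\x91" => "'",   // CP1252 left single quote
            "\x92" => "'",   // CP1252 right single quote
            "\x93" => '"',   // CP1252 left double quote
            "\x94" => '"',   // CP1252 right double quote
            "\x96" => '-',   // CP1252 en dash
            "\x97" => '--',  // CP1252 em dash
            "\x85" => '...', // CP1252 ellipsis
        );
    }

    // strtr() with an array replaces everything in a single pass, so text
    // that has already been replaced is never scanned again.
    return strtr($string, $map);
}
```

For example, `wordPunctuationToAscii("Wait\x85 what\x97now")` takes the CP1252 branch and returns `Wait... what--now`. If the form is submitted as UTF-8, the first map is the one that applies.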