$string = iconv("UTF-8", "UTF-8//IGNORE", $string);
I thought this code would remove invalid UTF-8 characters, but it produces [E_NOTICE] "iconv(): Detected an illegal character in input string". What am I missing? How do I properly strip illegal characters from a string?
The output character set (the second parameter) should be different from the input character set (the first parameter). If they are the same and the string contains illegal UTF-8 characters, iconv will reject them as illegal according to the input character set.
I know two ways to fix a UTF-8 string that contains illegal characters:
- Illegal characters are replaced by question marks ("?"):
$message = mb_convert_encoding($message, 'UTF-8', 'UTF-8');
- Illegal characters are removed:
$message = iconv('UTF-8', 'UTF-8//IGNORE', $message);
The second method is the one described in the question, but it does not produce any E_NOTICE in my case. I tested different corrupted UTF-8 strings with error_reporting(E_ALL); and the result was always as expected. Possibly something has changed since 2012. I tested on PHP 7.2.9 on Windows.
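If iconv behaves differently on your platform, the mbstring method above can also be made to remove illegal bytes instead of replacing them. Here is a minimal sketch, assuming the mbstring extension is loaded; strip_invalid_utf8 is a hypothetical helper name, not a built-in PHP function:

```php
<?php
// Hypothetical helper (not part of PHP): drops byte sequences that are not valid UTF-8.
function strip_invalid_utf8(string $input): string
{
    // Remember the current substitute character so it can be restored afterwards.
    $previous = mb_substitute_character();
    // 'none' tells mbstring to drop invalid sequences instead of writing a substitute.
    mb_substitute_character('none');
    try {
        return mb_convert_encoding($input, 'UTF-8', 'UTF-8');
    } finally {
        mb_substitute_character($previous);
    }
}

$corrupted = "abc\xFFdef";                       // 0xFF can never appear in valid UTF-8
$stripped  = strip_invalid_utf8($corrupted);     // the invalid byte is removed
var_dump(mb_check_encoding($stripped, 'UTF-8')); // the cleaned string validates as UTF-8
```

Restoring the previous substitute character keeps the change local to the helper, since mb_substitute_character() otherwise affects every later mbstring conversion in the request.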
To simply suppress the notice, you can use the "@" operator:
$string = @iconv("UTF-8", "UTF-8//IGNORE", $string);
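Note that "@" only hides the notice. iconv() still returns false when the conversion fails, so the result should be checked. A rough sketch of one way to handle that, assuming mbstring is available; clean_utf8 is a hypothetical helper name and the fallback order is just one possible choice:

```php
<?php
// Hypothetical helper: returns valid UTF-8 and only uses "@" when cleaning is needed.
function clean_utf8(string $input): string
{
    // Fast path: nothing to do if the string is already valid UTF-8.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }
    if (function_exists('iconv')) {
        // Suppress the notice, but still inspect the return value.
        $result = @iconv('UTF-8', 'UTF-8//IGNORE', $input);
        if ($result !== false && mb_check_encoding($result, 'UTF-8')) {
            return $result;
        }
    }
    // Fallback: mbstring replaces illegal bytes with its substitute character.
    return mb_convert_encoding($input, 'UTF-8', 'UTF-8');
}

$string = clean_utf8("abc\xFFdef");
```

Whether the illegal byte ends up removed or replaced depends on which branch succeeds on your system, but in both cases the returned string is valid UTF-8.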
Source: https://stackoverflow.com/questions/9375909/iconv-utf-8-ignore-still-produces-illegal-character-error