Question
I need to clean a string that is copied and pasted from various Microsoft Office applications (Excel, Access, and Word), each with its own encoding.
I'm using json_encode for debugging so that I can see every single encoded character.
I've been able to clean everything I've found so far (\r, \n) with str_replace, but I have no luck with \u00a0.
$string = 'mail@mail.com\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;mail@mail.com'; //this is the output from json_encode
$clean = str_replace("\u00a0", "",$string);
returns:
mail@mail.com\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;mail@mail.com
That is exactly the same; it completely ignores \u00a0.
Is there a way around this? Also, I feel like I'm reinventing the wheel: is there a function or class that completely strips EVERY possible character of EVERY possible encoding?
____EDIT____
After the first two replies I need to clarify that my example DOES work, because it's the output from json_encode, not the actual string!
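To make the distinction in the edit concrete, here is a minimal sketch of what is probably going on. It assumes the pasted text is UTF-8, so each non-breaking space is stored as the two bytes 0xC2 0xA0; the sample string is shortened for illustration.

```php
<?php
// Assumption: the real string is UTF-8, so NBSP is the byte pair 0xC2 0xA0.
$raw = "mail@mail.com\xC2\xA0 \xC2\xA0;mail@mail.com";

// json_encode escapes non-ASCII characters by default,
// which is where the visible "\u00a0" text comes from.
$debug = json_encode($raw);

// The raw string never contains the six characters \u00a0,
// so searching for them changes nothing.
$unchanged = str_replace('\u00a0', '', $raw);

// Searching for the actual bytes removes the non-breaking spaces.
$clean = str_replace("\xC2\xA0", '', $raw);
```

The escape sequence only exists in the json_encode output, so the replacement has to target the bytes of the original string instead.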
Answer 1:
Works for me when I copy/paste your code. Try replacing the double quotes in your str_replace() with single quotes, or escaping the backslash ("\\u00a0").
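Answer 2:
By combining ord() with substr() on my string containing \u00a0, I found that the character is stored as the bytes 194 and 160 (0xC2 0xA0, the UTF-8 encoding of U+00A0), and the following curse works: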
$text = str_replace( chr( 194 ) . chr( 160 ), ' ', $text );
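The inspection step this answer describes might look roughly like the sketch below. The sample string and the $codes variable are made up for illustration, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical sample: "a", a UTF-8 non-breaking space, then "b".
$text = "a\xC2\xA0b";

// Walk the string byte by byte and record each byte's numeric value.
$codes = [];
for ($i = 0; $i < strlen($text); $i++) {
    $codes[] = ord(substr($text, $i, 1));
}
// $codes now holds 97, 194, 160, 98: the NBSP shows up as 194 followed by 160.

// Replace that byte pair with a regular space.
$text = str_replace(chr(194) . chr(160), ' ', $text);
```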
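Answer 3:
I just had the same problem. Apparently PHP's json_encode will fail (returning null or false, depending on the PHP version) for any string with a single-byte 'non-breaking space' (byte 160) in it, because that byte on its own is not valid UTF-8.
The solution is to replace it with a regular space:
$string = str_replace(chr(160), ' ', $string);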
I hope this helps somebody - it took me an hour to figure out.
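A small sketch of the failure described here, assuming the string arrived in a single-byte encoding such as ISO-8859-1 or Windows-1252, where the non-breaking space is the lone byte 0xA0. The variable names are illustrative.

```php
<?php
// Assumption: single-byte encoded input, NBSP is the byte 0xA0.
$latin1 = "a\xA0b";

// A lone 0xA0 is not valid UTF-8, so encoding fails.
$encoded = json_encode($latin1);
$error   = json_last_error();   // JSON_ERROR_UTF8 on current PHP versions

// After swapping the byte for a regular space, encoding succeeds.
$fixed = str_replace(chr(160), ' ', $latin1);
$ok    = json_encode($fixed);
```

If the string is already UTF-8, replacing chr(160) alone is risky: it leaves a stray 0xC2 byte behind and can damage other characters that contain the byte 0xA0 (for example "à", which is 0xC3 0xA0). In that case the two-byte form chr(194) . chr(160) is the safer target.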
Answer 4:
A minor point: \u00a0 is actually a non-breaking space character; see http://www.fileformat.info/info/unicode/char/a0/index.htm
So it might be more correct to replace it with " " (a regular space) rather than an empty string.
Answer 5:
You have to do this with single quotes like this:
str_replace('\u00a0', "",$string);
Or, if you like to use double quotes, you have to escape the backslash - which would look like this:
str_replace("\\u00a0", "",$string);
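For reference, here is a sketch of how PHP treats the different spellings. The "\u{...}" escape only exists in PHP 7.0 and later, and the variable names are illustrative.

```php
<?php
// These all produce the same six-character text: backslash, u, 0, 0, a, 0.
$single  = '\u00a0';
$escaped = "\\u00a0";
$double  = "\u00a0";   // without braces, PHP leaves \u untouched in double quotes

// PHP 7.0+ only: the braced form yields the real character as UTF-8 bytes.
$nbsp = "\u{00A0}";

// The literal text matches json_encode output; the real character matches raw input.
$fromJson = str_replace($single, '', 'mail@mail.com\u00a0;mail@mail.com');
$fromRaw  = str_replace($nbsp, '', "mail@mail.com\xC2\xA0;mail@mail.com");
```

Which form to use depends on whether you are cleaning the json_encode output or the original string.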
Answer 6:
This also works; I found it somewhere:
$str = trim($str, chr(0xC2).chr(0xA0));
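A quick sketch of what trim() does and does not cover here, using made-up sample strings. trim() treats its second argument as a list of individual bytes and only strips them from the two ends of the string.

```php
<?php
// NBSP at the start, in the middle, and at the end (UTF-8 assumed).
$str = "\xC2\xA0x\xC2\xA0y\xC2\xA0";

// Only the leading and trailing non-breaking spaces are removed.
$trimmed = trim($str, chr(0xC2) . chr(0xA0));

// Caveat: because the list is byte-based, a string ending in "à" (0xC3 0xA0)
// loses its final byte and is left with invalid UTF-8.
$damaged = trim("voil\xC3\xA0", chr(0xC2) . chr(0xA0));
```

So this fits when the non-breaking spaces only pad the ends of the value; for ones in the middle, str_replace or preg_replace is needed.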
Answer 7:
This did the trick for me:
$str = preg_replace( "~\x{00a0}~siu", " ", $str );
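Building on the same idea, a hedged variant can collapse a whole run of mixed regular and non-breaking spaces into one space, which matches the sample in the question. This assumes the input is valid UTF-8 (the u modifier makes preg_replace return null otherwise); the pattern is an illustration, not the answerer's code.

```php
<?php
// Sample modelled on the question: NBSPs alternating with regular spaces.
$str = "mail@mail.com\xC2\xA0 \xC2\xA0 \xC2\xA0;mail@mail.com";

// Match one or more whitespace characters or U+00A0 and replace the run with a single space.
$collapsed = preg_replace('~[\s\x{00a0}]+~u', ' ', $str);

// Use an empty replacement instead to drop the run entirely.
$removed = preg_replace('~[\s\x{00a0}]+~u', '', $str);
```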
Source: https://stackoverflow.com/questions/2592502/i-have-a-string-with-u00a0-and-i-need-to-replace-it-with-str-replace-fail