Does preg_replace() change my character set?

Posted by 别等时光非礼了梦想. on 2019-12-09 11:34:43

Question


I have the following piece of code which seems to be changing my character set.

     $html = "à";
     echo $html;  // result: à
     $html = preg_replace("/\s/", "", $html);
     echo $html;  // result: ?

However, when I use [\t\n\r\f\v] as my pattern instead of the \s shorthand, it works fine:

     $html = "à";
     echo $html;  // result: à
     $html = preg_replace("/[\t\n\r\f\v]/", "", $html);
     echo $html;  // result: à

Why is that?


Answer 1:


I had the same problem. The cause is UTF-8.

à is encoded as the two bytes 0xC3 0xA0 in UTF-8. In PHP you can write it as "\xc3\xa0".
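To check the byte sequence yourself, a minimal sketch like the following should do. The variable name `$a` is just for illustration, and the character is spelled out as escape sequences so the result does not depend on the encoding of the source file.

```php
<?php
// "à" spelled out as its two UTF-8 bytes, independent of the file's encoding.
$a = "\xc3\xa0";

echo strlen($a), "\n";   // 2 (strlen counts bytes, not characters)
echo bin2hex($a), "\n";  // c3a0
```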

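Without the u flag, PCRE works on the string byte by byte, and \s can match the byte 0xA0 on its own, as if it were the single-byte "non-breaking space" of ISO-8859-1. Whether it does seems to depend on the character tables PCRE is using, which can vary with platform and locale. Once that byte is removed, only a lone 0xC3 is left, which is not valid UTF-8, so it shows up as ?.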

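You can use the u flag to resolve the problem. It makes PCRE treat the pattern and the subject as UTF-8, so \s is matched against whole characters instead of single bytes: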

     $html = preg_replace("/\s/u", "", $html);
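The difference between byte mode and UTF-8 mode can be sketched as below. Because \s matching 0xA0 without the u flag may vary between systems, this sketch uses the explicit pattern `/\xa0/` as a stand-in for that behaviour; that substitution, the helper `is_valid_utf8()`, and the variable names `$broken` and `$fixed` are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: preg_match() with the u flag returns false
// when the subject is not valid UTF-8, and 1 when the empty pattern matches.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// "à" as explicit UTF-8 bytes.
$html = "\xc3\xa0";

// Byte mode (no u flag): the pattern matches the single byte 0xA0.
// This stands in for \s on a system where 0xA0 counts as whitespace.
$broken = preg_replace('/\xa0/', '', $html);

// UTF-8 mode: \s is tested against the whole character U+00E0, which is not whitespace.
$fixed = preg_replace('/\s/u', '', $html);

echo bin2hex($broken), "\n";       // c3
echo bin2hex($fixed), "\n";        // c3a0
var_dump(is_valid_utf8($broken));  // bool(false)
var_dump(is_valid_utf8($fixed));   // bool(true)
```

The lone 0xC3 byte left in `$broken` is what a browser or terminal renders as ?.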


Source: https://stackoverflow.com/questions/19629893/does-preg-replace-change-my-character-set
