Regex pattern using \w.* not matching text starting with foreign characters such as Ä

拜拜、爱过 submitted on 2019-12-04 05:39:51

If you do want to match umlauts, add the /u modifier to the regex, or use \pL in place of \w. Either change lets the regex match letters outside the ASCII range.

Reference: http://www.regular-expressions.info/unicode.html
and http://php.net/manual/en/regexp.reference.unicode.php
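
As a rough illustration of the two options above, here is a minimal PHP sketch. The sample string is made up, and it assumes the script is saved as UTF-8 and runs under the default "C" locale; the behaviour of \w under /u is as in current PHP versions.

```php
<?php
// Hypothetical sample input (not from the original question).
$text = "Änderung";

// Without /u, PCRE reads the subject byte by byte and \w only covers
// ASCII word characters, so the first byte of "Ä" is rejected.
$ascii = preg_match('/^\w.*/', $text);

// With /u, the subject is treated as UTF-8 and \w also covers
// non-ASCII letters such as "Ä".
$unicode = preg_match('/^\w.*/u', $text);

// \pL matches any Unicode letter; /u makes PCRE read "Ä" as one character.
$letter = preg_match('/^\pL.*/u', $text);

var_dump($ascii, $unicode, $letter);
```

If the second and third calls report a match while the first does not, the missing /u modifier is the likely cause of the original problem.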

Ä is a German umlaut, if I am not mistaken. \w matches (in most flavors) only [a-zA-Z0-9_].

You will need to match the Unicode range of characters that you want.

\x{00C4} (PHP) is the character you want. You will probably need to create a character class that covers your Unicode characters.
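
A character class along those lines might look like the sketch below. The set of code points (the German umlauts and ß) and the helper name startsWithAllowedChar are assumptions made for illustration; adjust the class to the characters you actually expect. The /u modifier is used so that both the pattern and the subject are read as UTF-8.

```php
<?php
// Hypothetical whitelist: ASCII word characters plus Ä Ö Ü ä ö ü ß.
// The \x{...} escapes name single code points and are read by PCRE itself.
$pattern = '/^[A-Za-z0-9_\x{00C4}\x{00D6}\x{00DC}\x{00E4}\x{00F6}\x{00FC}\x{00DF}].*/u';

// Hypothetical helper: true when the line starts with a whitelisted character.
function startsWithAllowedChar(string $pattern, string $line): bool
{
    return preg_match($pattern, $line) === 1;
}

var_dump(startsWithAllowedChar($pattern, "Ärger"));  // Ä (U+00C4) is in the class
var_dump(startsWithAllowedChar($pattern, "Éclair")); // É (U+00C9) is not in the class
```

The trade-off compared with \pL is that an explicit class accepts only the letters listed, which can be useful when the input is known to be German text.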

You may have to switch to matching Unicode character ranges.

For ASCII you would use [\u0021-\u007e]. In this case, maybe something like [\u0021-\u007e\u0192-\u0687].

I'm not quite sure which range of characters you want, but I think \w only matches things in the normal ASCII range.

Consider using:

/(\d+)\n((\p{L}|\p{N}|_).*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/
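
To show how this pattern behaves, here is a small PHP sketch run against a made-up record. The input layout and values are assumptions, since the original data is not shown, and the /u modifier has been added here so that Ä is read as a single UTF-8 character.

```php
<?php
// The suggested pattern, with /u appended (an assumption, see above).
$pattern = '/(\d+)\n((\p{L}|\p{N}|_).*)\n(\d{3}\.\d{3}\.\d{2})\n(\d.*)\n(\d.*)/u';

// Hypothetical five-line record; the second line starts with "Ä".
$input = "12\nÄpfel und Birnen\n123.456.78\n1 Stück\n2,50";

if (preg_match($pattern, $input, $m)) {
    // $m[2] is the whole text line; $m[3] is only its first character.
    echo "id:   {$m[1]}\n";
    echo "name: {$m[2]}\n";
    echo "code: {$m[4]}\n";
} else {
    echo "no match\n";
}
```

Note that the inner group (\p{L}|\p{N}|_) shifts the numbering, so the dotted number ends up in group 4 rather than group 3.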