Strange behaviour with Boost regex and emojis

Submitted by 巧了我就是萌 on 2019-12-11 18:28:26

Question


The emoji sequence "👩‍❤️‍💋‍👩" does not seem to be rendered as a single glyph in Notepad++ or on regex101.com. (It is rendered as 4 to 7 characters, depending on whether you count the joiners.)
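To make the make-up of the sequence concrete, here is a minimal Python sketch that lists its code points. It assumes the usual encoding of this emoji (woman, zero-width joiner, heart, variation selector, joiner, kiss mark, joiner, woman), written with explicit escapes so nothing depends on how an editor stores it. The names `KISS` and `describe` are made up for this illustration.

```python
import unicodedata

# Assumed spelling of the sequence, written with explicit escapes.
KISS = "\U0001F469\u200D\u2764\uFE0F\u200D\U0001F48B\u200D\U0001F469"


def describe(text):
    """Return a (code point, Unicode name) pair for every code point in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


for code_point, name in describe(KISS):
    print(code_point, name)

# Number of code points versus number of UTF-16 code units.
code_points = len(KISS)
utf16_units = len(KISS.encode("utf-16-le")) // 2
print(code_points, utf16_units)
```

Under that assumption the string holds 8 code points: four visible symbols, three joiners, and one variation selector. It takes 11 UTF-16 code units, because the three emoji outside the Basic Multilingual Plane each need a surrogate pair. Which of these units an editor or regex engine treats as one "character" may differ from tool to tool.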

Anyway, I would expect .* and (?:.)* to behave the same way, but that does not seem to be the case in Notepad++. (It works as expected on regex101.)

In Notepad++, .* seems to match all characters, but (?:.)* does not.

For example, given this input:

foobar👩‍❤️‍💋‍👩

.* will match everything (foobar👩‍❤️‍💋‍👩)

(?:.)* will match foobar, followed by several zero-width matches between the emojis.

Why am I getting those results? Is it a bug in the Boost regex engine? Or is the main reason that Notepad++ does not display the emoji sequence correctly? In that case, why does it work on regex101?

Edit:

[\s\S]* does not seem to work either. It behaves the same way as (?:.)*
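As a point of comparison, the three patterns can be run through a different engine. The sketch below uses Python's built-in re module, which is not Boost and does not reproduce what Notepad++ does. It only gives a baseline for the behaviour I expected, similar to what regex101 shows. The emoji is spelled with explicit escapes, assuming its usual encoding, and `first_match` is a hypothetical helper name.

```python
import re

# "foobar" followed by the emoji sequence (assumed encoding, explicit escapes).
TEXT = "foobar" + "\U0001F469\u200D\u2764\uFE0F\u200D\U0001F48B\u200D\U0001F469"

# The three patterns discussed in this question.
PATTERNS = [r".*", r"(?:.)*", r"[\s\S]*"]


def first_match(pattern, text):
    """Return the text matched at the start of the string, or None."""
    match = re.match(pattern, text)
    return match.group(0) if match else None


results = {pattern: first_match(pattern, TEXT) for pattern in PATTERNS}

for pattern, matched in results.items():
    # True means the pattern consumed the entire input in one match.
    print(pattern, matched == TEXT)
```

In Python all three patterns consume the whole input in a single match, which is the result I would also expect from Notepad++.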

Source: https://stackoverflow.com/questions/55765464/strange-behaviour-with-boost-regex-and-emojis
