Split string based on regex but keep delimiters

感情迁移 submitted on 2019-12-02 04:06:22

Well, you can use lookaround to split at points between characters without consuming the delimiters:

(?<=[()>*;\s-])|(?=[()>*;\s-])

This creates a split point before and after each delimiter character. The hyphen goes last in the character class so that it is a literal; written as *-; it would form a range from * to ; that also matches digits, +, comma, ., / and :. You will probably want to remove the empty and whitespace-only elements from the resulting array.

Quick PowerShell test (| marks the split points):

PS Home:\> 'if (x>1) return x * fact(x-1);' -split '(?<=[()>*;\s-])|(?=[()>*;\s-])' -join '|'
if| |(|x|>|1|)| |return| |x| |*| |fact|(|x|-|1|)|;|
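
For comparison, here is a minimal Python sketch of the same lookaround split, with the cleanup step that drops empty and whitespace-only pieces. The names DELIMS, SPLIT_RE and tokenize are made up for this illustration, and it assumes Python 3.7 or later, where re.split splits on empty matches.

```python
import re

# Hypothetical names for illustration only.
# The hyphen is placed last in the class so it is a literal '-', not a range.
DELIMS = r"[()>*;\s-]"
SPLIT_RE = re.compile(rf"(?<={DELIMS})|(?={DELIMS})")


def tokenize(source):
    """Split before and after every delimiter, then drop empty and
    whitespace-only pieces (requires Python 3.7+ for empty-match splits)."""
    return [piece for piece in SPLIT_RE.split(source) if piece.strip()]


tokens = tokenize("if (x>1) return x * fact(x-1);")
print(tokens)
```

Because the hyphen is literal here, a multi-digit number such as 10 stays in one piece instead of being split digit by digit.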

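Alternatively, match the tokens instead of splitting. This pattern matches either a run of word characters or a single punctuation or symbol character: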

(\w+)|([\p{P}\p{S}])
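
A rough Python sketch of this match-instead-of-split idea follows. Python's built-in re module does not support \p{P} or \p{S}, so the sketch swaps in [^\w\s] (any character that is neither a word character nor whitespace) as an approximation; that substitution, and the names TOKEN_RE and find_tokens, are assumptions of this example rather than part of the pattern above.

```python
import re

# Approximation: [^\w\s] stands in for [\p{P}\p{S}], which Python's `re`
# does not support. TOKEN_RE and find_tokens are hypothetical names.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def find_tokens(source):
    """Return runs of word characters and single non-word, non-space characters."""
    return TOKEN_RE.findall(source)


print(find_tokens("if (x>1) return x * fact(x-1);"))
```

Whitespace never matches either alternative, so no cleanup pass is needed afterwards.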

To answer your question, "Why?": your entire expression is a lookahead assertion. A lookahead is zero-width, so the split happens at every position between characters where the assertion is true, and nothing is consumed.
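
A small Python illustration of that behaviour, using a made-up input and delimiter set (not the expression from the question), and assuming Python 3.7 or later:

```python
import re

# Lookahead only: a split point before each delimiter, so the delimiter
# stays attached to the text that follows it.
before_only = re.split(r"(?=[>;])", "a>b;c")

# Lookbehind plus lookahead: split points on both sides of each delimiter.
both_sides = re.split(r"(?<=[>;])|(?=[>;])", "a>b;c")

print(before_only)  # ['a', '>b', ';c']
print(both_sides)   # ['a', '>', 'b', ';', 'c']
```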

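Also, you cannot group within character classes. Inside [...], parentheses are literal characters, so (<=) there does not match the two-character operator <=; it only adds (, <, = and ) to the set of single characters.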
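
A short Python sketch of the difference, using a made-up input. To treat a multi-character operator as one delimiter, one option (an assumption here, not taken from the question) is to list it as an alternative before the single-character class:

```python
import re

# Inside a class, '(' and ')' are literals: this matches ONE character
# out of '(', '<', '=', ')'.
single_chars = re.findall(r"[(<=)]", "a<=b")

# Alternation outside the class keeps '<=' together; the capturing group
# makes re.split return the delimiters as well.
kept_together = re.split(r"(<=|[<>=])", "a<=b")

print(single_chars)   # ['<', '=']
print(kept_together)  # ['a', '<=', 'b']
```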
