How can I parse nested blocks using Regex? [duplicate]

Submitted by 依然范特西╮ on 2019-12-11 17:09:30

Question


Possible Duplicates:
RegEx match open tags except XHTML self-contained tags
.NET Regex balancing groups expression - matching when not balanced

For example, if I had the input:

[quote]He said:
    [quote]I have no idea![/quote]
But I disagree![/quote]

And another quote:

[quote]Some other quote here.[/quote]

How can I effectively grab blocks of quotes using regular expressions without grabbing too much or too little? For example, if I use:

\[Quote\](.+)\[/Quote\]

This will grab too much (basically, the entire thing), whereas this:

\[Quote\](.+?)\[/Quote\]

will grab too little (it will only grab [quote]He said: [quote]I have no idea![/quote], leaving the opening and closing tags mismatched).

So how can I effectively parse nested blocks of code like this using Regex?
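
The behaviour of the two patterns above can be reproduced with a short Python sketch. It assumes case-insensitive matching (so that [Quote] in the pattern matches [quote] in the input) and dot-matches-newline mode; neither flag is stated in the question, and the variable names are purely illustrative.

```python
import re

sample = (
    "[quote]He said:\n"
    "    [quote]I have no idea![/quote]\n"
    "But I disagree![/quote]\n"
    "\n"
    "And another quote:\n"
    "\n"
    "[quote]Some other quote here.[/quote]"
)

# Assumed flags: IGNORECASE so [Quote] matches [quote],
# DOTALL so '.' also matches line breaks.
FLAGS = re.IGNORECASE | re.DOTALL

# Greedy: runs from the first opening tag to the very last closing tag.
greedy = re.search(r"\[Quote\](.+)\[/Quote\]", sample, FLAGS).group(0)

# Lazy: stops at the first closing tag, which belongs to the inner quote.
lazy = re.search(r"\[Quote\](.+?)\[/Quote\]", sample, FLAGS).group(0)

print(repr(greedy))
print(repr(lazy))
```

The greedy match swallows both top-level quotes and the text between them, while the lazy match contains two opening tags but only one closing tag.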


Answer 1:


Regexes and nesting do not work well together. It's possible (but, depending on the regex dialect you're using, potentially very cumbersome) to construct a regex that matches only an innermost pair. However, if you want to match an entire quote with nested quotes inside, then regular expressions are simply not a strong enough tool. You'll need to look into context-free parser technology, or perform successive replacements that rewrite the nested quotes to something else before matching the outer ones.
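
As a rough illustration of the "innermost pair plus successive replacements" idea, here is a minimal Python sketch. The pattern forbids any quote tag inside the matched body, so it can only match an innermost pair; each match is swapped for a placeholder and the process repeats until nothing is left. The function name extract_quotes and the {{Q0}}-style placeholders are hypothetical choices for this example, and the sketch assumes the input never contains such placeholders itself.

```python
import re

# Matches [quote]...[/quote] only when the body contains no quote tag at all,
# i.e. an innermost pair. The negative lookahead is checked before every character.
INNERMOST = re.compile(
    r"\[quote\]((?:(?!\[/?quote\]).)*)\[/quote\]",
    re.IGNORECASE | re.DOTALL,
)

def extract_quotes(text):
    """Return (remaining_text, blocks); blocks[i] is the body behind {{Qi}}."""
    blocks = []

    def stash(match):
        # Remember the body and leave a placeholder in its place.
        blocks.append(match.group(1))
        return "{{Q%d}}" % (len(blocks) - 1)

    while True:
        # Each pass removes the current innermost layer of quotes.
        text, count = INNERMOST.subn(stash, text)
        if count == 0:
            return text, blocks

sample = (
    "[quote]He said:\n"
    "    [quote]I have no idea![/quote]\n"
    "But I disagree![/quote]\n"
    "\n"
    "And another quote:\n"
    "\n"
    "[quote]Some other quote here.[/quote]"
)

remaining, blocks = extract_quotes(sample)
print(remaining)
print(blocks)
```

After the loop, nested quotes appear inside their parent's body as placeholders, so the nesting structure can be rebuilt by following the placeholder indices. Unbalanced tags are simply left in the remaining text.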




Answer 2:


Take a look at my XML indenter: it uses one group to match from the beginning tag to its last (closing) tag, and another group to get the content, which is then processed recursively.
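
The indenter itself is not reproduced here, so the following Python sketch is not that code. It only illustrates a loosely related approach: use a regex solely to locate tags, pair each top-level opening tag with its matching close by counting depth, and then recurse on the content. The name parse_quotes and the (body, children) tuple layout are assumptions made for this example.

```python
import re

# Finds both [quote] and [/quote]; group 1 is "/" for a closing tag.
TAG = re.compile(r"\[(/?)quote\]", re.IGNORECASE)

def parse_quotes(text):
    """Return a list of (body, children) tuples for the top-level quotes."""
    result = []
    depth = 0
    start = None
    for match in TAG.finditer(text):
        if match.group(1) == "":
            # Opening tag: remember where a top-level body begins.
            if depth == 0:
                start = match.end()
            depth += 1
        elif depth > 0:
            # Closing tag that has a matching opener.
            depth -= 1
            if depth == 0:
                body = text[start:match.start()]
                # Recurse to pick up quotes nested inside this body.
                result.append((body, parse_quotes(body)))
        # A closing tag seen at depth 0 is stray and is ignored.
    return result

sample = (
    "[quote]He said:\n"
    "    [quote]I have no idea![/quote]\n"
    "But I disagree![/quote]\n"
    "\n"
    "And another quote:\n"
    "\n"
    "[quote]Some other quote here.[/quote]"
)

tree = parse_quotes(sample)
print(tree)
```

In this sketch an opening tag that is never closed is silently dropped; a stricter version might raise an error instead.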



Source: https://stackoverflow.com/questions/7151468/how-can-i-parse-nested-blocks-using-regex
