Regex, match newlines between tags

僤鯓⒐⒋嵵緔 posted on 2021-02-05 12:13:01

Question


I have this regex in PHP:

preg_match('/\[summary\](.+)\[\/summary\]/i', $data['text'], $match);

It works fine when the text between the summary tags is on one line. However, when it contains newlines, it doesn't match.

I've tried to find the right modifier here: http://nl2.php.net/manual/en/reference.pcre.pattern.modifiers.php, but the only one related to newlines is "m", and that doesn't do what I want.

How can I make this work?


Answer 1:


The manual page you've linked to describes another option that affects how line breaks are handled:

s (PCRE_DOTALL) If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
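
As a minimal sketch of the difference, the snippet below runs the pattern from the question with and without the "s" modifier. The sample text in $text is an assumption made for illustration; the question does not show its actual data.

```php
<?php
// Hypothetical sample input; the original question does not show its data.
$text = "Intro\n[summary]First line\nSecond line[/summary]\nOutro";

// Without "s", the dot stops at newlines, so a multi-line summary does not match.
$withoutS = preg_match('/\[summary\](.+)\[\/summary\]/i', $text, $m1);

// With "s" (PCRE_DOTALL), the dot also matches newlines.
$withS = preg_match('/\[summary\](.+)\[\/summary\]/is', $text, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
echo $m2[1], "\n";   // prints both lines of the summary
```

If the text may contain more than one summary block, a lazy quantifier (.+? instead of .+) is probably what you want, since the greedy form would capture everything from the first opening tag to the last closing tag.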



Answer 2:


Regexes are fundamentally bad at parsing HTML (see "Can you provide some examples of why it is hard to parse XML and HTML with a regex?" for why). What you need is an HTML parser. See "Can you provide an example of parsing HTML with your favorite parser?" for examples using a variety of parsers.

You may find this answer that uses SimpleXML helpful.
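
A rough sketch of the parser approach with SimpleXML might look like the following. It assumes the input is well-formed XML that uses a <summary> element, which is not the bracketed [summary] syntax from the question; the sample document and element names are made up for illustration.

```php
<?php
// Hypothetical input: assumes well-formed XML with a <summary> element,
// not the bracketed [summary] syntax used in the question.
$xml = "<post><summary>First line\nSecond line</summary><body>Rest</body></post>";

$doc = simplexml_load_string($xml);
if ($doc === false) {
    throw new RuntimeException('Input is not well-formed XML');
}

// Casting an element to string yields its text content, newlines included.
$summary = (string) $doc->summary;
echo $summary, "\n";
```

Because the parser works on the document structure rather than on individual lines, newlines inside the element need no special handling.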



Source: https://stackoverflow.com/questions/1006265/regex-match-newlines-between-tags
