Question
I have this regex in PHP:
preg_match('/\[summary\](.+)\[\/summary\]/i', $data['text'], $match);
It works fine when the text between the summary tags is on one line. However, when it contains newlines, it doesn't match.
I've looked for a suitable modifier here: http://nl2.php.net/manual/en/reference.pcre.pattern.modifiers.php. But the only one that seems related to newlines is "m", and that doesn't do what I want.
How can I make this work?
Answer 1:
The manual page you've linked to describes another option that affects how line breaks are handled:
s (PCRE_DOTALL) If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
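Applied to the pattern from the question, this could look like the sketch below. The sample text is made up for illustration, and the lazy quantifier `.+?` is an addition that is not in the original pattern; it is only there so the match stops at the first closing tag if the text contains several summary blocks.

```php
<?php
// Hypothetical sample input: the summary spans two lines.
$data = ['text' => "Intro\n[summary]First line\nSecond line[/summary]\nOutro"];

// Original pattern: without the s modifier, "." does not match "\n",
// so the multi-line summary is not found.
$without = preg_match('/\[summary\](.+)\[\/summary\]/i', $data['text'], $m1);

// Same pattern with s (PCRE_DOTALL) added: "." now matches newlines too.
// The lazy .+? is an assumption added here, see the note above.
$with = preg_match('/\[summary\](.+?)\[\/summary\]/is', $data['text'], $match);

var_dump($without); // int(0)
var_dump($with);    // int(1)
echo $match[1], "\n"; // the captured text, including the line break
```

The "m" modifier does not help here because it only changes what `^` and `$` match; it has no effect on the dot.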
Answer 2:
Regexes are fundamentally bad at parsing HTML (see "Can you provide some examples of why it is hard to parse XML and HTML with a regex?" for why). What you need is an HTML parser. See "Can you provide an example of parsing HTML with your favorite parser?" for examples using a variety of parsers.
You may find this answer that uses SimpleXML helpful.
Source: https://stackoverflow.com/questions/1006265/regex-match-newlines-between-tags