PHP RegEx remove empty paragraph tags

Submitted by 两盒软妹~` on 2020-01-03 08:44:11

Question


I'm trying to remove all the empty <p> tags that CKEditor inserts into a description box, but they all seem to vary. The possibilities seem to be:

<p></p>

<p>(WHITESPACE)</p>

<p>&nbsp;</p>

<p><br /></p>

<p>(NEWLINE)&nbsp;</p>

<p>(NEWLINE)<br /><br />(NEWLINE)&nbsp;</p>

Across these possibilities, there could be any amount of whitespace, &nbsp; entities and <br /> tags between the paragraph tags, and a single paragraph could contain some of each kind.

I'm also not sure about the <br /> tag; from what I've seen, it could be <br />, <br/> or <br>.

I've searched SO for a similar question, but the answers I've seen each cater for just one of these cases, not all of them at once. In simple terms, what I'm asking is: is there a regular expression I can use to remove all <p> tags from some HTML that don't have any alphanumeric text or symbols/punctuation in them?


Answer 1:


Well, in conflict with my suggestion not to parse HTML with regexes, I wrote up a regex to do just that:

"#<p>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#"

This will match properly for:

<p></p>

<p> </p> <!-- ([space]) -->

<p> </p> <!-- (That's a [tab] character in there) -->

<p>&nbsp;</p>

<p><br /></p>

<p>
&nbsp;</p>

<p>
<br /><br />
&nbsp;</p>

What it does:

# #                --> opening delimiter (regex start)
# <p>              --> match the opening <p> tag
# (                --> group open.
#   \s             --> match any whitespace character (newline, space, tab)
# |                --> or
#   &nbsp;         --> match &nbsp;
# |                --> or
#   </?\s?br\s?/?> --> match the <br> tag (<br>, <br/> or <br />)
# )*               --> group close, match any number of any of the elements in the group
# </?p>            --> match the closing </p> tag ("/" optional, so a second <p> also matches)
# #                --> closing delimiter (regex end)
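
For what it's worth, here is a minimal sketch of how the pattern above might be applied in PHP with preg_replace. The helper name remove_empty_paragraphs and the sample input are made up for illustration; only the pattern itself comes from the answer.

```php
<?php
// Hypothetical helper (name chosen for illustration): strips paragraphs that
// contain only whitespace, &nbsp; entities and <br> variants.
function remove_empty_paragraphs(string $html): string
{
    // Pattern copied from the answer; "#" is the delimiter, so "/" needs no escaping.
    $pattern = '#<p>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#';

    // preg_replace() returns null on a PCRE error; fall back to the input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}

// Made-up sample input covering several of the cases listed in the question.
$input = "<p>Hello</p><p></p><p> &nbsp;<br /></p><p>\n<br><br/>\n&nbsp;</p><p>World</p>";

echo remove_empty_paragraphs($input), "\n";
```

Because the closing part is written as </?p>, the pattern also treats a second <p> as the end of a paragraph. If that is not wanted, changing it to </p> should make the match stricter.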



Answer 2:


The selected answer is great, but it doesn't work if the <p> tag has attributes defined, such as an inline style like <p style="font-weight:bold">.

A regex that matches this case as well would be:

#<p[^>]*>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#
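
A rough sketch of this variant in use, under the same assumptions as before: the function name remove_empty_paragraphs_with_attrs and the sample markup are hypothetical, and the pattern is the one given just above.

```php
<?php
// Hypothetical helper (name chosen for illustration): same idea as before,
// but the opening tag may carry attributes such as style or class.
function remove_empty_paragraphs_with_attrs(string $html): string
{
    // [^>]* consumes anything up to the end of the opening tag.
    $pattern = '#<p[^>]*>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#';

    // preg_replace() returns null on a PCRE error; fall back to the input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}

// Made-up sample: two empty paragraphs (one styled) around one with real text.
$sample = '<p style="font-weight:bold">&nbsp;</p><p class="x">Keep</p><p><br></p>';

echo remove_empty_paragraphs_with_attrs($sample), "\n";
```

One thing to watch: <p[^>]*> can also match the opening tag of other elements whose names start with "p", such as <pre>. Writing it as <p\b[^>]*> (my own tweak, not part of the answer) should limit it to actual <p> tags.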


Source: https://stackoverflow.com/questions/14260670/php-regex-remove-empty-paragraph-tags
