Regex to match all HTML tags except <p> and </p>

抹茶落季 2020-11-30 06:31

I need to match and remove all tags using a regular expression in Perl. I have the following:

<\\\\??(?!p).+?>

But this still matches the closing </p> tag.

13 Answers
  •  独厮守ぢ
    2020-11-30 07:16

    I came up with this:

    <(?!\/?p(?=>|\s.*>))\/?.*?>
    
    m{
    <               # match the opening angle bracket
    (?!             # negative lookahead (asserts no match, consumes nothing)
        \/?         # zero or one /
        p           # the letter p
        (?=         # positive lookahead (asserts a match, consumes nothing)
            >       # > directly after p: no attributes
            |       # or
            \s      # whitespace
            .*      # anything up to
            >       # a closing angle bracket: tag with attributes
        )           # close the positive lookahead
    )               # close the negative lookahead
                    # reaching this point means the tag is neither
                    # a p tag nor a closing p tag,
                    # with or without attributes
    \/?             # optional closing-tag slash (/)
    .*?             # then as little as possible up to
    >               # the first closing angle bracket
    }x
    

    This leaves p tags (with or without attributes) and closing p tags alone, but still matches pre and other tags whose names merely start with p, with or without attributes.

    It doesn't strip the attributes from the p tags it keeps, but my source data doesn't contain any. I may add that later; this will suffice for now.
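
To see how the pattern from the answer above behaves in a substitution, here is a minimal sketch. The helper name `strip_tags_except_p` and the sample string are made up for illustration and are not part of the original answer. It also assumes each tag sits on a single line, since `.` does not match a newline without the `/s` modifier.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): removes every tag
# except <p ...> and </p>, using the pattern from the answer unchanged.
sub strip_tags_except_p {
    my ($html) = @_;
    # s{...}{}g deletes every match; the {} delimiters let / appear
    # in the pattern without extra escaping.
    $html =~ s{<(?!/?p(?=>|\s.*>))/?.*?>}{}g;
    return $html;
}

# Made-up sample input: <div>, <b> and <pre> are removed, <p> and </p> stay.
my $sample = '<div><p class="x">Hello <b>world</b></p><pre>code</pre></div>';
print strip_tags_except_p($sample), "\n";
# prints: <p class="x">Hello world</p>code
```

Only the tags are removed. The text inside a removed element, such as `code` inside `<pre>`, stays in the output.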
