Regex: Strip HTML attributes except SRC

予麋鹿 2020-12-16 17:13

I'm trying to write a regular expression that will strip all tag attributes except for the SRC attribute. For example:

6 Answers
  •  时光取名叫无心
    2020-12-16 17:52

    Unfortunately I'm not sure how to answer this question for PHP. If I were using Perl I would do the following:

    use strict;
    my $data = q^

    This is a paragraph with an image

    ^;
    $data =~ s{
        <([^/> ]+)([^>]+)>    # split into tagtype, attribs
    }{
        my $attribs = $2;
        my @parts = split( /\s+/, $attribs );    # separate by whitespace
        @parts = grep { m/^src=/i } @parts;      # retain just src tags
        if ( @parts ) {
            "<" . join( " ", $1, @parts ) . ">";
        } else {
            "<" . $1 . ">";
        }
    }xseg;
    print( $data );

    which returns

    This is a paragraph with an image
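
    For anyone not working in Perl, here is a rough Python sketch of the same idea: match each opening tag, pull out only its src attribute, and rebuild the tag. The function name strip_attributes_except_src and the sample markup in the usage note are my own placeholders, not something from the question. It is still a regex, so it assumes reasonably tidy markup (no ">" inside attribute values, for instance); a real HTML parser is the safer choice for arbitrary input.

```python
import re

# Opening tag: group 1 is the tag name, group 2 is the raw attribute text.
# Requiring a leading letter skips closing tags, comments, and doctypes.
TAG_RE = re.compile(r"<([A-Za-z][^/>\s]*)([^>]*)>")

# A src attribute with a double-quoted, single-quoted, or unquoted value.
# The lookbehind keeps attributes such as data-src from matching.
SRC_RE = re.compile(
    r"""(?<![\w-])src\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]+)""",
    re.IGNORECASE,
)


def strip_attributes_except_src(html: str) -> str:
    """Remove every attribute except src from each opening tag (hypothetical helper)."""

    def rebuild(match: re.Match) -> str:
        name = match.group(1)
        attribs = match.group(2).rstrip()
        # Remember a trailing "/" so self-closing tags stay self-closing
        # (they are normalised to the " />" form).
        self_closing = attribs.endswith("/")
        if self_closing:
            attribs = attribs[:-1]
        kept = SRC_RE.findall(attribs)
        closing = " /" if self_closing else ""
        return "<" + " ".join([name, *kept]) + closing + ">"

    return TAG_RE.sub(rebuild, html)
```

    With made-up input such as <p id="intro" class="green">Text <img src="/path/to/image.jpg" width="50" height="75"/> here</p>, this gives back <p>Text <img src="/path/to/image.jpg" /> here</p>.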
