Strip all HTML tags except links

小蘑菇 2020-11-29 03:29

I am trying to write a regular expression to strip all HTML with the exception of links (the <a href> and </a> tags, respectively). It does n

6 answers
  •  囚心锁ツ
    2020-11-29 03:56

    In general, there are problems with this approach. Regexes are best for 'flat' text matches; nested data pushes regex engines into areas for which they were not designed. General HTML parsing needs a parser, not a regex engine (search for the difference between regular and context-free languages if you want the full technical details).

    It is easy to strip out all tags by replacing anything that looks like a tag with the empty string, or by replacing the angle brackets with their entity equivalents (&lt; and &gt;). Selectively filtering HTML with regexes, however, is vulnerable to a wide range of accidental or malicious inputs that break the filter.
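
As a rough illustration of the parser-based route, here is a minimal Python sketch built on the standard library's html.parser. Nothing in it comes from the question itself: the names LinkOnlyStripper, strip_except_links, NAIVE_TAG and ALLOWED_PREFIXES are hypothetical, the naive pattern is only one guess at what a "strip every tag" regex might look like, and the rule of keeping only http, https and mailto links is an assumption. Treat it as a starting point, not a vetted sanitizer.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical "strip everything" pattern, shown only for comparison.
NAIVE_TAG = re.compile(r"<[^>]*>")

# Assumption: only these link targets are considered safe to keep.
ALLOWED_PREFIXES = ("http://", "https://", "mailto:")


class LinkOnlyStripper(HTMLParser):
    """Keep text and <a href="..."> links; drop every other tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # output fragments, joined at the end
        self.a_stack = []    # one bool per open <a>: was it emitted?
        self.skip = 0        # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
        elif tag == "a":
            href = (dict(attrs).get("href") or "").strip()
            keep = href.lower().startswith(ALLOWED_PREFIXES)
            self.a_stack.append(keep)
            if keep:
                # Re-emit only the href; other attributes are dropped.
                self.parts.append('<a href="%s">' % escape(href))

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip = max(0, self.skip - 1)
        elif tag == "a" and self.a_stack:
            if self.a_stack.pop():
                self.parts.append("</a>")

    def handle_data(self, data):
        if not self.skip:
            # Text arrives unescaped, so escape it again on the way out.
            self.parts.append(escape(data, quote=False))


def strip_except_links(html):
    parser = LinkOnlyStripper()
    parser.feed(html)
    parser.close()
    # Close any links the input left open.
    closers = ["</a>" for kept in parser.a_stack if kept]
    return "".join(parser.parts + closers)


if __name__ == "__main__":
    sample = '<p>See <a href="https://example.com" onclick="x()">this <b>page</b></a>.</p>'
    print(strip_except_links(sample))
    # See <a href="https://example.com">this page</a>.

    tricky = '<img alt="a > b" src="x.png">text'
    print(repr(NAIVE_TAG.sub("", tricky)))     # ' b" src="x.png">text'
    print(repr(strip_except_links(tricky)))    # 'text'
```

The second sample shows the kind of input the answer warns about: a ">" inside a quoted attribute ends the naive match early and leaks markup into the output, while the parser treats the whole tag as one unit.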
