Strip HTML tags with Perl

粉色の甜心 2020-12-17 03:02

What's the easiest way to strip HTML tags in Perl? I am using a regular expression to parse HTML fetched from a URL, which works great, but how can I strip the HTML tags off?

5 answers
  •  一向 (OP)
    2020-12-17 03:35

    Have a look at the HTML::Restrict module, which lets you either strip all HTML tags or restrict which tags are allowed. A minimal example that strips all HTML tags:

    use HTML::Restrict;
    
    my $hr = HTML::Restrict->new();
    my $processed = $hr->process('<b>i am bold</b>'); # returns 'i am bold'
    

    I would recommend staying away from HTML::Strip because it breaks UTF-8 encoding.
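
    The "restrict" side of HTML::Restrict mentioned above can be sketched roughly as follows, assuming the module is installed from CPAN. The rules argument maps each allowed tag to a list of allowed attributes (an empty list means the tag is kept without attributes). The sample markup and variable names here are made up for illustration:

```perl
use strict;
use warnings;
use HTML::Restrict;

# Allow only <b> (with no attributes); every other tag is stripped.
my $hr = HTML::Restrict->new(
    rules => {
        b => [],
    },
);

# Hypothetical input, for illustration only.
my $html  = '<p><b>hello</b> <i>world</i></p>';
my $clean = $hr->process($html);

# The <b> element should survive; <p> and <i> should be removed,
# leaving their text content behind.
print "$clean\n";
```

    Calling HTML::Restrict->new() with no rules, as in the example above, gives the strip-everything behaviour, so an allow-list like this is only needed when some markup should be kept.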
