How do I parse HTML with Perl?

Question

I'm new to programming and learning Perl as well.

Here is my question: How can I parse the data below in Perl using Perl modules?

<h4>This is the line</h4>
abc : 130.65 TB<br>
dif : 74.52 TB<br>
asw : 56.13 TB<br>
qwe : 57<br>

This is sample data from a webpage, and I want output like this:

abc = 130.65 TB
dif = 74.52 TB
asw = 56.13 TB
qwe = 57

Can anyone please help me?

Answer 1:

Use an HTML parsing module like HTML::Parser or HTML::TreeBuilder.

If you are just trying to extract the text and strip all the tags, then it should be as simple as:

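    use HTML::TreeBuilder;

    # $YOUR_HTML_TEXT is a placeholder for a string holding the page source.
    my $tree = HTML::TreeBuilder->new();
    $tree->parse( $YOUR_HTML_TEXT );
    $tree->eof();                            # signal end of input
    my $just_the_text = $tree->as_text();    # text content with all tags stripped
    $tree->delete;                           # free the tree's memory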

You can also check http://htmlparsing.com/perl.html for more on parsing HTML with Perl.
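Note that as_text() returns one concatenated string, so the line structure marked by the <br> tags is lost. To get one "key = value" line per entry, as the question asks, one option is to walk the children of the body element and keep only the plain text nodes. The following is a minimal sketch under that assumption; the helper name extract_pairs and the regular expression are illustrative choices, not part of the original answer, and it assumes the "key : value" lines sit directly in the body as in the sample.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Sample fragment copied from the question.
my $html = <<'HTML';
<h4>This is the line</h4>
abc : 130.65 TB<br>
dif : 74.52 TB<br>
asw : 56.13 TB<br>
qwe : 57<br>
HTML

# Hypothetical helper: returns a list of "key = value" strings.
sub extract_pairs {
    my ($html_text) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->parse($html_text);
    $tree->eof;

    my @pairs;
    # TreeBuilder wraps a fragment in implicit <html> and <body> elements.
    my $body = $tree->look_down(_tag => 'body');
    for my $node ($body ? $body->content_list : ()) {
        next if ref $node;    # elements (<h4>, <br>) are objects; text nodes are plain strings
        # Accept only text shaped like "key : value" and trim surrounding whitespace.
        if ($node =~ /^\s*(\w+)\s*:\s*(.*?)\s*$/) {
            push @pairs, "$1 = $2";
        }
    }

    $tree->delete;
    return @pairs;
}

print "$_\n" for extract_pairs($html);
```

The heading text is skipped because it lives inside the <h4> element rather than directly in the body.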

Answer 2:

You can also use HTML::TokeParser. But if you prefer to work with a DOM model, try Mojo::DOM.
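As a rough illustration of the token-based approach, the sketch below reads the sample from the question with HTML::TokeParser and keeps only the text tokens. The helper name extract_pairs_tokeparser and the regular expression are assumptions made for this example, and it relies on every data line having the "key : value" shape shown in the question.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Sample fragment copied from the question.
my $html = <<'HTML';
<h4>This is the line</h4>
abc : 130.65 TB<br>
dif : 74.52 TB<br>
asw : 56.13 TB<br>
qwe : 57<br>
HTML

# Hypothetical helper: returns a list of "key = value" strings.
sub extract_pairs_tokeparser {
    my ($html_text) = @_;

    # Passing a scalar reference tells the parser to read from a string.
    my $parser = HTML::TokeParser->new(\$html_text)
        or die "Cannot create parser";
    $parser->unbroken_text(1);    # report each run of text as a single token

    my @pairs;
    while (my $token = $parser->get_token) {
        next unless $token->[0] eq 'T';    # 'T' marks a text token; skip tags
        my $text = $token->[1];
        # The heading text has no "key : value" shape, so it does not match.
        if ($text =~ /^\s*(\w+)\s*:\s*(.*?)\s*$/) {
            push @pairs, "$1 = $2";
        }
    }
    return @pairs;
}

print "$_\n" for extract_pairs_tokeparser($html);
```

This variant never builds a tree, which can be convenient for large pages where only a few pieces of text are needed.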

Source: https://stackoverflow.com/questions/14051191/how-do-i-parse-html-with-perl

Tags

perl

html-parsing
