Getting started with XML::Parser [closed]

Submitted by 余生长醉 on 2019-12-11 23:21:34

Question


I've been googling for some time to find information on how to use a Perl XML parser. Being quite a newbie, though, I couldn't fully understand the documentation or the tutorials.

Just a few words about what I’d need the parser for (nothing exceptional, as you'll see):

I would like to read in an XML file and, as a first step, transform it into a LaTeX document. As a second step, I would like to extract certain pieces of information.

For example:

<body>
<head>Title</head>
<poem>
<l>xyz</l>
<l>xyz</l>
</poem>
</body>

This sample "XML" should be transformed into something like:

\begin{document}
\chapter{Title}
\begin{verse}
xyz\\
xyz
\end{verse}
\end{document}

Furthermore, I would like to put certain pieces of information (e.g. the text between the <l>...</l> tags) into an array or hash, perhaps together with the number of preceding </l>s.

I suppose tasks like these can be done very easily with a parser. The problem is that I have only a very vague idea of how to initialize and customize, for example, the XML::Parser module.

I'd be very thankful if anyone could help.


Answer 1:


Another way to handle XML in Perl is XML::XSH2:

use XML::XSH2;
xsh << 'end_xsh';
    open 8023786.xml ;
    cd body ;
    echo '\begin{document}' ;
    for poem {
        echo :s '\chapter{' preceding-sibling::head[1] '}' ;
        echo '\begin{verse}' ;
        for l echo :s text() xsh:if(following-sibling::*, '\\', '') ;
        echo '\end{verse}' ;
    }
    echo '\end{document}' ;
end_xsh



Answer 2:


The "best" way to transform XML into LaTeX would be to use XSLT.

STRONG SUGGESTION:

1) Familiarize yourself with basic Perl XML.

Alternatively, use a different language if you feel more comfortable with something else besides Perl - there are good XML libraries available for most languages.

I'd strongly recommend working through all three chapters in this tutorial:

XML For Perl Developers

2) Familiarize yourself with the basics of using XSLT stylesheets. For example:

Investigating XSLT: The XML Transformation Language

3) Investigate some ready-made XML to LaTeX XSL stylesheets. For example:

XML to LaTeX

... or ...

Transforming XHTML to LaTeX

... or ...

XSLT MathML Library

PS: I hasten to add that the XSLT approach is language- and platform-agnostic. You can use this approach in any language (Perl, Java, Python, etc.) and on any platform (Windows, Linux, MacOS, etc.).
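
To make the XSLT route concrete, here is a minimal sketch that drives a stylesheet from Perl. It assumes the CPAN modules XML::LibXML and XML::LibXSLT are installed; the stylesheet is a hypothetical one written only for the sample document in the question, not one of the ready-made stylesheets mentioned above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use XML::LibXML;
use XML::LibXSLT;

# Hypothetical stylesheet covering only the elements of the sample document.
my $xsl = <<'END_XSL';
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="body">
    <xsl:text>\begin{document}&#10;</xsl:text>
    <xsl:apply-templates/>
    <xsl:text>\end{document}&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="head">
    <xsl:text>\chapter{</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>}&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="poem">
    <xsl:text>\begin{verse}&#10;</xsl:text>
    <xsl:apply-templates select="l"/>
    <xsl:text>\end{verse}&#10;</xsl:text>
  </xsl:template>

  <!-- Every line except the last one in a poem ends with a LaTeX line break. -->
  <xsl:template match="l">
    <xsl:value-of select="."/>
    <xsl:if test="following-sibling::l">
      <xsl:text>\\</xsl:text>
    </xsl:if>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
END_XSL

# The sample document from the question.
my $xml = <<'END_XML';
<body>
<head>Title</head>
<poem>
<l>xyz</l>
<l>xyz</l>
</poem>
</body>
END_XML

# Parse both documents, compile the stylesheet, and apply it.
my $source     = XML::LibXML->load_xml(string => $xml);
my $style_doc  = XML::LibXML->load_xml(string => $xsl);
my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);
my $latex      = $stylesheet->output_as_chars($stylesheet->transform($source));

print $latex;
```

Because the stylesheet lives in its own string, it could just as well be kept in a separate .xsl file and reused from another language, which is the portability argument made above.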




Answer 3:


For complete control over XML translation, implement a finite-state machine using SAX. Perl has XML::SAX with different parser backends (XML::SAX::ExpatXS, XML::LibXML::SAX). Here is one possible solution:

#!/usr/bin/env perl
package XML::SAX::Handler::XML2LaTeX;
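use feature qw(say);
use strict;
use warnings qw(all);

use base qw(XML::SAX::Base);

sub new {
    return bless {
        data => '',   # character data collected for the current element
        line => [],   # lines of the poem currently being read
    } => __PACKAGE__;
}

# Opening tag: reset the text buffer and print the matching LaTeX opener.
# Plain if/elsif is used instead of given/when, which is deprecated in recent Perls.
sub start_element {
    my ($self, $el) = @_;
    $self->{data} = '';
    my $name = $el->{Name};
    if ($name eq 'body') {
        say '\begin{document}';
    } elsif ($name eq 'poem') {
        say '\begin{verse}';
        $self->{line} = [];
    }
    return;
}

# Closing tag: the text buffer now holds the element's content.
sub end_element {
    my ($self, $el) = @_;
    my $data = $self->{data};
    my $name = $el->{Name};
    if ($name eq 'body') {
        say '\end{document}';
    } elsif ($name eq 'head') {
        say "\\chapter{$data}";
    } elsif ($name eq 'poem') {
        # Join the collected lines with a LaTeX line break (\\).
        say join "\\\\\n", @{$self->{line}};
        say '\end{verse}';
    } elsif ($name eq 'l') {
        push @{$self->{line}}, $data;
    }
    return;
}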

sub characters {
    my ($self, $data) = @_;
    $self->{data} .= $data->{Data};
    return;
}

1;

package main;
use strict;
use warnings qw(all);

use XML::SAX::PurePerl;

my $handler = XML::SAX::Handler::XML2LaTeX->new;
my $parser = XML::SAX::PurePerl->new(Handler => $handler);

$parser->parse_file(\*DATA);

__DATA__
<body>
<head>Title</head>
<poem>
<l>xyz</l>
<l>xyz</l>
</poem>
</body>
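
Since the question asks about XML::Parser specifically, the same state-machine idea can be sketched with that module's Start, End and Char handlers. This is only a sketch under stated assumptions: XML::Parser must be installed from CPAN, the variable names ($text, @poem, @lines, $latex) are made up for illustration, and "number of preceding </l>s" is interpreted as a count within the current poem.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use XML::Parser;

my $text  = '';   # character data of the element currently being read
my @poem;         # text of the lines in the current poem
my @lines;        # one hash per <l>: its text and the number of preceding <l>s
my $latex = '';   # LaTeX output, built up as the document is parsed

my $parser = XML::Parser->new(
    Handlers => {
        # Called for every opening tag.
        Start => sub {
            my ($expat, $name) = @_;
            $text = '';
            if ($name eq 'body') {
                $latex .= '\begin{document}' . "\n";
            } elsif ($name eq 'poem') {
                $latex .= '\begin{verse}' . "\n";
                @poem = ();
            }
        },
        # Called for every closing tag; $text holds the element's content.
        End => sub {
            my ($expat, $name) = @_;
            if ($name eq 'head') {
                $latex .= "\\chapter{$text}\n";
            } elsif ($name eq 'l') {
                # Assumption: count preceding lines within the current poem.
                push @lines, { text => $text, preceding => scalar @poem };
                push @poem, $text;
            } elsif ($name eq 'poem') {
                $latex .= join("\\\\\n", @poem) . "\n" . '\end{verse}' . "\n";
            } elsif ($name eq 'body') {
                $latex .= '\end{document}' . "\n";
            }
            $text = '';
        },
        # May be called several times per text node, so append.
        Char => sub {
            my ($expat, $string) = @_;
            $text .= $string;
        },
    },
);

# The sample document from the question.
my $xml = <<'END_XML';
<body>
<head>Title</head>
<poem>
<l>xyz</l>
<l>xyz</l>
</poem>
</body>
END_XML

$parser->parse($xml);

print $latex;
printf "line after %d preceding: %s\n", $_->{preceding}, $_->{text} for @lines;
```

Collecting the output in $latex instead of printing from inside the handlers keeps the transformation (first step) and the extracted @lines data (second step) available after parsing.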


Source: https://stackoverflow.com/questions/8023786/getting-started-with-xmlparser
