Anyone know of any Perl module to escape text in an XML document?
I'm generating XML which will contain text that was entered by the user. I want to correctly handl
For programs that need to handle every special case, by all means use an official library for this task. However, theoretically there are only 5 characters that need escaping in XML.
So, for one-offs that you don't want to pull in an extra library for, the following perl expression should suffice:
perl -pe 's/\&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g; s/"/\&quot;/g; s/'"'"'/\&apos;/g'
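Inside a script, the same five replacements can be done in one pass with a lookup table. This is only a minimal sketch: `xml_escape` and `%XML_ENTITY` are hypothetical names, not part of any module, and it assumes the input is already valid text for your document's encoding.

```perl
use strict;
use warnings;

# The five characters that have predefined entities in XML.
my %XML_ENTITY = (
    '&'  => '&amp;',
    '<'  => '&lt;',
    '>'  => '&gt;',
    '"'  => '&quot;',
    q{'} => '&apos;',
);

# Hypothetical helper: escape text for element content or attribute values.
sub xml_escape {
    my ($text) = @_;
    # One pass over the string: every special character is looked up once,
    # so the '&' of an entity we just produced is never escaped again.
    $text =~ s/([&<>"'])/$XML_ENTITY{$1}/g;
    return $text;
}

print xml_escape(q{Tom & "Jerry" <tom@example.com>}), "\n";
```

Because everything happens in a single substitution, the order of the entries in the table does not matter.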
XML::Simple's escape_value could also be used, but XML::Simple is not recommended for new programs. See post 17436965.
A manual escape could be done using regex (copied from escape_value):
$data =~ s/&/&amp;/sg;
$data =~ s/</&lt;/sg;
$data =~ s/>/&gt;/sg;
$data =~ s/"/&quot;/sg;
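With separate substitutions like these, the ampersand has to be handled first, otherwise the `&` in entities produced by the earlier steps gets escaped a second time. A small sketch of the difference (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $input = 'a < b';

# Correct order: '&' first, then the other characters.
(my $good = $input) =~ s/&/&amp;/g;
$good =~ s/</&lt;/g;

# Wrong order: '<' first, then '&' re-escapes the entity just produced.
(my $bad = $input) =~ s/</&lt;/g;
$bad =~ s/&/&amp;/g;

print "$good\n";   # a &lt; b
print "$bad\n";    # a &amp;lt; b
```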
XML::Entities:
use XML::Entities;
my $a_encoded = XML::Entities::numify('all', $a);
Edit: XML::Entities::numify only converts named HTML entities to numeric ones; it does not escape raw characters. Use encode_entities($a) from HTML::Entities instead.
Use XML::Code.
From CPAN
XML::Code escape()
Normally any content of the node will be escaped during rendering (i. e. special symbols like '&' will be replaced by corresponding entities). Call escape() with zero argument to prevent it:
my $p = XML::Code->new('p');
$p->set_text ("&#8212;");
$p->escape (0);
print $p->code(); # prints <p>&#8212;</p>
$p->escape (1);
print $p->code(); # prints <p>&amp;#8212;</p>
Use XML::Generator:
require XML::Generator;
my $xml = XML::Generator->new( ':pretty', escape => 'always,apos' );
print $xml->h1( " &< >non-html plain text< >&" );
which will print all content inside the tags escaped (no conflicts with the markup).
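The `apos` in the escape option above suggests that apostrophes are escaped as well. That matters mostly for attribute values delimited by single quotes, where a raw apostrophe would end the value early. A hand-rolled illustration of the idea, assuming nothing about how XML::Generator does it internally (`attr_escape` is a hypothetical helper):

```perl
use strict;
use warnings;

# Hypothetical helper, not XML::Generator's implementation.
sub attr_escape {
    my ($value) = @_;
    $value =~ s/&/&amp;/g;    # ampersand first
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;
    $value =~ s/"/&quot;/g;
    $value =~ s/'/&apos;/g;   # needed when the attribute is single-quoted
    return $value;
}

my $title = q{Bob's "best" <notes>};
print "<doc title='", attr_escape($title), "'/>\n";
```

With both quote characters escaped, the value is safe whichever delimiter the surrounding markup uses.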
I personally prefer XML::LibXML, the Perl binding for libxml2. One of its advantages is that it uses one of the fastest XML processing libraries available. Here is an example of creating a text node:
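use XML::LibXML;
# $some_encoding, $name and $text are placeholders supplied by the caller
my $doc = XML::LibXML::Document->new('1.0', $some_encoding);
my $element = $doc->createElement($name);
$element->appendText($text);          # special characters in $text are escaped on output
$doc->setDocumentElement($element);   # attach the element so it appears in the document
my $xml_fragment = $element->toString();
my $xml_document = $doc->toString();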
And, never, ever create XML by hand. It's gonna be bad for your health when people find out what you did.