How can I preserve HTML entities with Diazo?

人走茶凉 submitted on 2020-01-06 06:49:07


I have the following simple Diazo rules file:


  <theme href="theme/theme.html" />

  <replace css:theme-children="#content" css:content-children=".content" />


and theme:

    <div id="content">
        Lorem ipsum ...

The source I want to transform is:

    <div class="content">
        <a href="&#0109;&#0097;&#0105;lt&#0111;&#0058;info&#0064;example&#46;org">info</a>

What I get is:

    ... <a href="mailto:info@example.org">info</a> ...

but I want to keep the HTML entities of the href attribute intact. How can I do this with Diazo?


Note that numeric character references are not entity references, so your title is a bit misleading. (The answer for preserving or not preserving entity references such as "&nbsp;" is very different.)

I don't know Diazo, but in XSLT, if you add

    <xsl:output encoding="US-ASCII"/>

to your stylesheet, then any non-ASCII characters will be output using numeric character references.
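As a rough illustration of the same serialization rule outside XSLT, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is only an analogue and is not Diazo or an XSLT processor: when the output encoding is US-ASCII, a character that the encoding cannot represent is written as a numeric character reference. The sample element and text are made up for the demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: element text containing one non-ASCII character (é).
elem = ET.fromstring("<p>caf\u00e9</p>")

# Serialize with US-ASCII as the output encoding. The serializer cannot
# represent é in ASCII, so it falls back to a numeric character reference.
ascii_bytes = ET.tostring(elem, encoding="us-ascii")

print(ascii_bytes)
```

Plain ASCII characters in the same text are left as they are, which is why this setting does not help with the `href` in the question.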

However, in your example they are in fact ASCII characters that have been written as references, such as "." as "&#46;". There isn't any standard way in XSLT 1 to do that (and there should never be any reason to do that if the document is going to be processed by a conforming HTML or XML system). Any such system will expand those references to their characters before processing starts. (This is why XSLT cannot preserve them: they have been removed by the XML parser before XSLT sees the input data.)
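The parser-level expansion described here can be observed with any conforming XML parser. The following sketch uses Python's standard `xml.etree.ElementTree` as a stand-in (an assumption for illustration; Diazo itself runs on a different XML/XSLT stack) and feeds it the link from the question. By the time application code sees the attribute, the references are already plain characters.

```python
import xml.etree.ElementTree as ET

# The anchor from the question, with the href written as numeric character references.
source = (
    '<a href="&#0109;&#0097;&#0105;lt&#0111;&#0058;info'
    '&#0064;example&#46;org">info</a>'
)

link = ET.fromstring(source)

# The parser has already expanded every reference; no trace of the
# original "&#...;" spelling is kept in the parsed tree.
href = link.attrib["href"]
print(href)

# Serializing the tree again therefore writes the plain characters.
round_tripped = ET.tostring(link, encoding="unicode")
print(round_tripped)
```

Because the original spelling is not part of the parsed tree, any transformation that works on that tree has nothing left to preserve.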

