Retaining an entity in XSLT stylesheet output without using a character map

久未见 submitted on 2020-06-17 13:12:25

Question


Where did we go wrong?

When I process this XML with XSLT 2.0 on Saxon-HE:

 <data>
      <grab>Grab me and print me back &quot;</grab>
 </data>

using this stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:template match="/">
        <xsl:apply-templates select="/data/grab"/>
    </xsl:template>

    <xsl:template match="/data/grab">
        <node><xsl:value-of select="text()"/></node>
    </xsl:template>

</xsl:stylesheet>

I get this output:

<?xml version="1.0" encoding="UTF-8"?><node>Grab me and print me back "</node>

But I want to retain the &quot; in the output XML, so we added a character map:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:character-map name="specialchar">
        <xsl:output-character character="&quot;" string="&amp;quot;"/>
    </xsl:character-map>
    <xsl:output method="xml" indent="no"  use-character-maps="specialchar"/>
    <xsl:template match="/">
        <xsl:apply-templates select="/data/grab"/>
    </xsl:template>

    <xsl:template match="/data/grab">
        <node><xsl:value-of select="text()"/></node>
    </xsl:template>

</xsl:stylesheet>

This retains the &quot; entity, but, IMHO, it looks verbose and ugly.

Is this really necessary? Is there not a more elegant alternative? If not, what is the rationale behind this?


Answer 1:


Architecturally, XSLT transforms XDM trees to XDM trees; it does not transform lexical XML to lexical XML. XDM trees do not distinguish between &quot; and ", any more than they distinguish between <a id="5"/> and <a id = '5'></a>. The fact that arbitrary and irrelevant differences in the way you write the XML are hidden from the XSLT programmer is very much by design, and makes it much easier to write correct transformations.
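
The equivalence described above can be checked from inside a stylesheet. The following is a minimal sketch, assuming an XSLT 2.0 processor such as Saxon-HE; the variable names are made up for illustration, and the stylesheet can be run against any source document because it ignores its input.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="text"/>

    <!-- Two lexical spellings of the same element (hypothetical variable names) -->
    <xsl:variable name="compact"><a id="5"/></xsl:variable>
    <xsl:variable name="spaced"><a id = '5'></a></xsl:variable>

    <!-- Entity reference versus the literal quote character -->
    <xsl:variable name="escaped"><grab>Grab me and print me back &quot;</grab></xsl:variable>
    <xsl:variable name="literal"><grab>Grab me and print me back "</grab></xsl:variable>

    <xsl:template match="/">
        <!-- deep-equal() compares the trees, not the way they were written -->
        <xsl:value-of select="deep-equal($compact, $spaced), deep-equal($escaped, $literal)"
                      separator=" "/>
    </xsl:template>
</xsl:stylesheet>
```

Both comparisons should yield true, because by the time the stylesheet runs, each pair has already been parsed into identical trees.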

Now there are certainly use cases for preserving entity references: particularly semantic entity references like &author; that might take different values on different occasions. But entity references aren't a particularly good solution to that requirement; XInclude is usually better. And the argument doesn't apply to predefined entity references like &quot; (or to character references like &#34;): it's really hard to see a good use case for treating &quot; and " differently, and you certainly haven't provided one.
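
As a rough illustration of the XInclude alternative, the sketch below pulls an author name from a separate text file instead of declaring an &author; entity. The file names book.xml and author.txt and the element names are hypothetical; only the xi:include element, its namespace, and the href and parse attributes come from the XInclude specification.

```xml
<!-- book.xml: hypothetical document that takes the author name from author.txt -->
<book xmlns:xi="http://www.w3.org/2001/XInclude">
    <title>Example</title>
    <author><xi:include href="author.txt" parse="text"/></author>
</book>
```

XInclude processing normally has to be switched on in the XML parser. Once it is, the stylesheet sees the included text as an ordinary text node inside author, and swapping author.txt changes the value without touching book.xml.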

At a practical level, Saxon couldn't preserve the &quot; even if it wanted to, because it doesn't know it's there: the XML parser (which converts lexical XML to XDM) doesn't notify character references to the application. Again, that's by design: the theory is that applications shouldn't know and shouldn't care. And it has the great virtue that we don't get zillions of SO questions from application developers who failed to cater for the possibility.
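
One way to see this from within XSLT is to read the same file twice, once as raw text and once through the XML parser. This is a sketch only, assuming the source document from the question is saved as input.xml next to the stylesheet (a hypothetical file name) and that an XSLT 2.0 processor is used.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- Raw file text, never passed through the XML parser: the reference is still spelled out -->
        <xsl:value-of select="contains(unparsed-text('input.xml'), '&amp;quot;')"/>
        <xsl:text> </xsl:text>
        <!-- Parsed tree: the six-character spelling is gone ... -->
        <xsl:value-of select="contains(doc('input.xml')/data/grab, '&amp;quot;')"/>
        <xsl:text> </xsl:text>
        <!-- ... and only the quote character itself remains -->
        <xsl:value-of select="contains(doc('input.xml')/data/grab, '&quot;')"/>
    </xsl:template>
</xsl:stylesheet>
```

Under those assumptions the three tests should give true, false and true. The stylesheet is itself XML, so the &quot; in the last select attribute has likewise been turned into a plain quote character before the XPath expression is compiled.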



Source: https://stackoverflow.com/questions/62286935/retaining-entity-in-xslt-stylesheet-output-without-using-character-map
