Question
My team uses an XML editor, MadCap Flare, to write technical documentation, with both PDF and HTML outputs. We need to use H2s for some pages to correctly format the PDFs, but for SEO purposes, we need to convert those to H1s for web. I have written a build event that converts H2s to H1s upon web publish. However, I have just realized that the XSL code incorrectly strips spaces from between variables and images. I discovered xsl:preserve-space, but using this breaks the rest of the code so that H2s are never converted to H1s. I need to find a way to both perform the conversion and preserve the space.
Here is a snippet of the source HTM (and before you ask, no, I can't remove the span tags; they're inserted by Flare when it converts the variables to text):
<div role="main" id="mc-main-content">
<h2><span class="GlobalCompany">BeyondTrust</span> <span class="ProductsPA">Privileged Remote Access</span> Web Rep Console Requirements</h2>
<p>To run the <span class="GlobalCompany">BeyondTrust</span> <span class="ProductNamesWebConsole">web rep console</span> on your system...</p>
Here's the batch file I use as the build event:
@ECHO Off
set outputDir=%1
@set XSLAltova=C:\Users\%username%\AltovaXML.exe
REM Create filelist
dir %outputDir%*.htm /b /s /A-D > file_list.txt
@echo ^<filelist^>^</filelist^> > pre_filelist.xml
REM XML-ize filelist
%XSLAltova% /xslt2 convert_filelist.xsl /in pre_filelist.xml /out pre_list.xml
REM Replace starting h2 tags with h1 tags
%XSLAltova% /xslt2 h2toh1.xsl /in pre_list.xml /out null.xml
REM Garbage collection
DEL pre_list.xml
DEL pre_filelist.xml
DEL file_list.txt
Here's convert_filelist.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Set output style. XML with no indentations -->
<xsl:output indent="no" method="xml" omit-xml-declaration="yes"/>
<!-- Reads the file list text file into memory as a global variable. -->
<xsl:variable name="fileList">file_list.txt</xsl:variable>
<!-- Parses the file list text file to create an XML list of files that can be fed to the transformer -->
<xsl:template match="filelist">
<!-- Create a variable that can be parsed -->
<xsl:variable name="filelist_raw"><xsl:value-of select="unparsed-text($fileList,'UTF-8')"/></xsl:variable>
<!-- Create a open and close file tags for each line in the list -->
<xsl:variable name="driveLetter"><xsl:value-of select="substring-before(unparsed-text($fileList,'UTF-8'),':')"/>:<xsl:text disable-output-escaping="yes">\\</xsl:text></xsl:variable>
<xsl:variable name="driveLetterReplacement"><xsl:text disable-output-escaping="yes">&lt;file&gt;</xsl:text><xsl:value-of select="$driveLetter"/></xsl:variable>
<!-- Generate an xml tree. The value-of is doing a text-level replacement. Looking for the drive letter and replacing it -->
<!-- with the file open tag and drive letter. Looking for the file extension and replacing with the extension and file close tag.-->
<file_list><xsl:value-of select="replace(replace($filelist_raw,'.htm','.htm&lt;/file&gt;'),$driveLetter,$driveLetterReplacement)" disable-output-escaping="yes"/></file_list>
</xsl:template>
</xsl:stylesheet>
And here's h2toh1.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Set output style. -->
<xsl:output method="xml" indent="yes" omit-xml-declaration="no"/>
<xsl:preserve-space elements="node()"/>
<!-- Begin traversing the list of files in the output folder. -->
<xsl:template match="file_list">
<xsl:for-each select="*">
<xsl:variable name="filename" select="."/>
<xsl:variable name="content" select="document($filename)"/>
<!-- Generate a new output file to replace the Flare generated file. Uses the same file name. Transparent to the end user. -->
<xsl:result-document href="{$filename}" method="html">
<xsl:apply-templates select="document($filename)">
<xsl:with-param name="content" select="$content"/>
</xsl:apply-templates>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
<!-- Recreate each node as it appears in the generated document -->
<xsl:template match="*">
<xsl:param name="content"/>
<xsl:variable name="name" select="name(.)"/>
<xsl:element name="{$name}">
<xsl:for-each select="@*">
<xsl:copy-of select="."/>
</xsl:for-each>
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<!-- Select the first header and change it to an h1. -->
<xsl:template match="*[matches(name(), 'h\d')][1]">
<xsl:element name="h1">
<xsl:for-each select="@*|node()">
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Here is the output without xsl:preserve-space: (h1 style, missing spaces between variables)
Here's the output with xsl:preserve-space: (h2 style, ugly blue for contrast, with spaces)
And here's the output I want but can't have: (h1 style, with spaces)
As it stands, my site is somewhat broken, and I don't have a ready means of fixing it without undoing a ton of work. Any help would be most appreciated.
Answer 1:
The reason your xsl:preserve-space declaration breaks the code is that you have given it an invalid value; the value of the elements attribute should be a list of element names or "*". But it's not going to be useful anyway; preserving whitespace is the default, and the only reason to use xsl:preserve-space is to counteract a general xsl:strip-space.
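For illustration, here is a minimal sketch of what a valid pair of declarations looks like. The element names h1, h2 and p are only assumptions based on the snippet in the question, and the identity template is there just to make the stylesheet complete; this is not a fix for the missing spaces if the whitespace is already gone before the stylesheet runs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Strip whitespace-only text nodes in every element by default. -->
  <xsl:strip-space elements="*"/>

  <!-- Counteract the general rule for specific elements. The value is a
       whitespace-separated list of element names (or "*"), not node().
       The names below are illustrative assumptions. -->
  <xsl:preserve-space elements="h1 h2 p"/>

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Without the xsl:strip-space line, the xsl:preserve-space line has nothing to override, which is why adding it on its own changes nothing.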
You appear to be using the Altova XSLT processor, which I suspect means you are using the Microsoft MSXML parser, which IIRC strips whitespace by default during XML parsing. If that's the case then nothing you do in the stylesheet itself will affect this, as the whitespace has already gone before XSLT processing starts. However, I'm no expert on this product combination so I might be wrong.
Incidentally, it's probably nothing to do with your problem, but the effect of setting disable-output-escaping while writing nodes to an xsl:variable is highly unpredictable. The W3C working group changed its mind several times over the years as to whether this should work (it's known as the "sticky-doe problem") and what a particular product does with it is anyone's guess.
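One way to sidestep disable-output-escaping entirely in convert_filelist.xsl is to build the file elements as real nodes instead of pasting tags in as text. The following is only a sketch under a few assumptions: it needs an XSLT 2.0 processor (the batch file already passes /xslt2), it assumes file_list.txt holds one path per line, and it writes each path exactly as listed rather than reproducing the drive-letter replacement in the original.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no" omit-xml-declaration="yes"/>

  <!-- Name of the text file produced by the dir command in the batch file. -->
  <xsl:variable name="fileList" select="'file_list.txt'"/>

  <xsl:template match="filelist">
    <file_list>
      <!-- Split the text on line breaks and skip blank lines. -->
      <xsl:for-each select="tokenize(unparsed-text($fileList, 'UTF-8'), '\r?\n')[normalize-space(.) != '']">
        <!-- Each path becomes a real element, so no escaping tricks are needed. -->
        <file><xsl:value-of select="normalize-space(.)"/></file>
      </xsl:for-each>
    </file_list>
  </xsl:template>

</xsl:stylesheet>
```

Because the file elements are constructed as nodes, the result does not depend on how a given processor handles disable-output-escaping inside variables.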
Source: https://stackoverflow.com/questions/59940101/xslpreserve-space-breaks-the-rest-of-the-code