Removing line breaks and broken entities using XSLT

Submitted by 冷眼眸甩不掉的悲伤 on 2019-12-13 09:16:19

Question


My XML is generated from a web form, and some users insert line breaks and other characters that end up in the data as literal \n sequences and broken entities, like &

I'm using some variables to convert and remove bad characters, but I don't know how to strip out these types of characters.

Here's the method I'm using to convert or strip out other bad characters. Let me know if you need to see the entire XSL. …

<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz_aaea'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ äãêÂ.,'" />
<xsl:variable name="linebreaks" select="'\n'" />
<xsl:variable name="nolinebreaks" select="' '" />

<xsl:value-of select="translate(Surname, $uppercase, $smallcase)"/>
<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

The text in the XML contains content like this:

<Office_photos>bn_1.jpg: Showing a little Red Sox Pride!&#13;\nLeft to right: 
 Tessa Michelle Summers, \nJulie Gross, Alexis Drzewiecki</Office_photos>

I'm trying to get rid of the literal \n sequences inside the data.


Answer 1:


As Lingamurthy CS explains in the comments, \n is not treated as a single character in XML. It is simply parsed as two characters, a backslash and the letter n, without any special handling.
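This also explains why the translate() call in the question cannot work. translate() maps characters one at a time, so with '\n' as the search string and a single space as the replacement, every backslash becomes a space and every letter n is deleted. A minimal sketch to see this for yourself (the test string is made up for illustration; run it against any XML input):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <xsl:template match="/">
      <!-- '\n' in an XPath string literal is two characters: a backslash and the letter n -->
      <xsl:value-of select="string-length('\n')" />
      <xsl:text>|</xsl:text>
      <!-- translate() works per character: '\' maps to ' ', and 'n' has no counterpart, so it is removed -->
      <xsl:value-of select="translate('Showing Pride!\nLeft to right', '\n', ' ')" />
   </xsl:template>
</xsl:stylesheet>
<!-- Output: 2|Showig Pride! Left to right -->
```

Note that the n in "Showing" is lost as well, which is why a substring replacement is needed rather than translate().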

If this is literally what you want to change, though, then in XSLT 1.0 you will need a recursive template to replace the text (XSLT 2.0 has a replace function; XSLT 1.0 doesn't).

A quick search on Stack Overflow finds one such template at XSLT string replace.

To call it, instead of doing this:

<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

You would do this:

  <xsl:call-template name="string-replace-all">
     <xsl:with-param name="text" select="Office_photos" />
     <xsl:with-param name="replace" select="$linebreaks" />
     <xsl:with-param name="by" select="$nolinebreaks" /> 
  </xsl:call-template>
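Note that this call drops the normalize-space() from the original expression, so real line breaks and carriage returns (such as the &#13; in the sample data) would remain in the output. If you still want those collapsed, one option is to capture the template result in a variable first. This is only a sketch: it assumes the string-replace-all template and the $linebreaks / $nolinebreaks variables are defined in the same stylesheet, and the variable name cleaned is made up for illustration.

```xml
<!-- Capture the result of the replace template; "cleaned" is a hypothetical name -->
<xsl:variable name="cleaned">
   <xsl:call-template name="string-replace-all">
      <xsl:with-param name="text" select="Office_photos" />
      <xsl:with-param name="replace" select="$linebreaks" />
      <xsl:with-param name="by" select="$nolinebreaks" />
   </xsl:call-template>
</xsl:variable>

<!-- normalize-space() trims the string and collapses runs of whitespace,
     including real line breaks and carriage returns, into single spaces -->
<xsl:value-of select="normalize-space($cleaned)" />
```

Running the replacement first and normalize-space() second means the spaces introduced by the replacement get collapsed too.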

Try this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output omit-xml-declaration="yes" indent="yes" />

   <xsl:variable name="linebreaks" select="'\n'" />
   <xsl:variable name="nolinebreaks" select="' '" />

   <xsl:template match="/">
      <xsl:call-template name="string-replace-all">
         <xsl:with-param name="text" select="Office_photos" />
         <xsl:with-param name="replace" select="$linebreaks" />
         <xsl:with-param name="by" select="$nolinebreaks" /> 
      </xsl:call-template>
   </xsl:template>

   <xsl:template name="string-replace-all">
     <xsl:param name="text" />
     <xsl:param name="replace" />
     <xsl:param name="by" />
     <xsl:choose>
       <xsl:when test="contains($text, $replace)">
         <xsl:value-of select="substring-before($text,$replace)" />
         <xsl:value-of select="$by" />
         <xsl:call-template name="string-replace-all">
           <xsl:with-param name="text" select="substring-after($text,$replace)" />
           <xsl:with-param name="replace" select="$replace" />
           <xsl:with-param name="by" select="$by" />
         </xsl:call-template>
       </xsl:when>
       <xsl:otherwise>
         <xsl:value-of select="$text" />
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>
</xsl:stylesheet>

(Credit to Mark Elliot who created the replace template)
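For comparison, if an XSLT 2.0 processor is available to you (an assumption; the question appears to target XSLT 1.0), the built-in replace() function mentioned above makes the recursive template unnecessary. A minimal sketch, assuming Office_photos is the document element as in the stylesheet above:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <xsl:template match="/">
      <!-- The second argument of replace() is a regular expression:
           '\\n' matches a literal backslash followed by the letter n.
           normalize-space() then collapses real line breaks and repeated spaces. -->
      <xsl:value-of select="normalize-space(replace(Office_photos, '\\n', ' '))" />
   </xsl:template>
</xsl:stylesheet>
```

The backslash has to be doubled here because replace() interprets its pattern as a regular expression, whereas the XSLT 1.0 template compares plain strings.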



Source: https://stackoverflow.com/questions/22018768/removing-line-breaks-and-broken-entities-using-xslt
