Regular text file to XML using XSLT

囚心锁ツ 2020-12-15 11:17

I have a text file that looks like this:

XXX^YYYY^AAAAA^XXXXXX^AAAAAA....

Fields are separated by a caret (^). My assumptions are:

2 Answers
  • 2020-12-15 11:50

    Tokenizing and sorting with XSLT 1.0

    If you use XSLT 2.0, it's much simpler: fn:tokenize(string, pattern)

    Example: tokenize("XPath is fun", "\s+")
    Result: ("XPath", "is", "fun")
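
    Applied to the caret-separated data from the question, the same call might look like the sketch below. The one thing to watch is that the caret is a regular-expression metacharacter, so the pattern has to be '\^' rather than '^'. The sample string is borrowed from the other answer, and the named template "main" is only an assumed entry point for a processor that can start without a source document.

    ```xml
    <xsl:stylesheet version="2.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="text"/>

     <!-- Assumed entry point: start the transformation at this named template -->
     <xsl:template name="main">
      <!-- '\^' matches a literal caret; each token is written on its own line -->
      <xsl:value-of
       select="tokenize('John^Smith^Bellevue^WA^98004', '\^')"
       separator="&#10;"/>
     </xsl:template>
    </xsl:stylesheet>
    ```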
    
  • 2020-12-15 12:04

    Text (non-XML) files can be read with the standard XSLT 2.0 function unparsed-text().

    Then one can use the standard XPath 2.0 function tokenize() and two other standard XPath 2.0 functions that accept a regular expression as one of their arguments: matches() and replace().

    XSLT 2.0 also has its own powerful instructions for text processing with regular expressions: <xsl:analyze-string>, with its child instructions <xsl:matching-substring> and <xsl:non-matching-substring>.

    See some of the more powerful capabilities of XSLT text processing with these functions and instructions in this real-world example: an XSLT solution to the WideFinder problem.
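
    As a rough illustration of the XSLT 2.0 route described above, here is a minimal sketch that reads the text file with unparsed-text(), strips the trailing line break with replace(), and splits the record with tokenize(). Several things in it are assumptions rather than part of the question: the parameter name pFile and its default 'input.txt', the field names (copied from the XSLT 1.0 solution), the fallback names Field6, Field7 and so on, and the premise that the file holds a single record.

    ```xml
    <xsl:stylesheet version="2.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xs="http://www.w3.org/2001/XMLSchema"
     exclude-result-prefixes="xs">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>

     <!-- Assumption: URI of the caret-delimited file; a relative URI is
          resolved against the location of this stylesheet -->
     <xsl:param name="pFile" as="xs:string" select="'input.txt'"/>

     <!-- Assumption: the same field names as in the XSLT 1.0 solution -->
     <xsl:variable name="vNames" as="xs:string+"
      select="('FirstName', 'LastName', 'City', 'State', 'Zip')"/>

     <!-- Named entry point, so no source XML document is required -->
     <xsl:template name="main">
      <!-- Assumption: one record per file; only trailing line breaks are removed -->
      <xsl:variable name="vLine"
       select="replace(unparsed-text($pFile), '[\r\n]+$', '')"/>

      <results>
       <!-- '\^' is a literal caret, since '^' alone is a regex anchor -->
       <xsl:for-each select="tokenize($vLine, '\^')">
        <xsl:variable name="vPos" select="position()"/>
        <field>
         <!-- Fall back to a generated name when there are more tokens than names -->
         <xsl:element name="{($vNames[$vPos], concat('Field', $vPos))[1]}"/>
         <value><xsl:value-of select="."/></value>
        </field>
       </xsl:for-each>
      </results>
     </xsl:template>
    </xsl:stylesheet>
    ```

    With Saxon, for example, such a stylesheet can be started at the named template using the -it:main option. For a file containing John^Smith^Bellevue^WA^98004 it is meant to produce the same <results> structure as the XSLT 1.0 solution.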

    Finally, here is an XSLT 1.0 solution:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:ext="http://exslt.org/common"
     xmlns:my="my:my" exclude-result-prefixes="ext my">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
    
     <my:fieldNames>
      <name>FirstName</name>
      <name>LastName</name>
      <name>City</name>
      <name>State</name>
      <name>Zip</name>
     </my:fieldNames>
    
     <!-- document('') loads this stylesheet itself, giving access to the embedded field names -->
     <xsl:variable name="vfieldNames" select=
      "document('')/*/my:fieldNames"/>
    
     <xsl:template match="/">
      <xsl:variable name="vrtfTokens">
       <xsl:apply-templates/>
      </xsl:variable>
    
      <!-- XSLT 1.0 cannot navigate a result tree fragment, so convert it to a node-set first -->
      <xsl:variable name="vTokens" select=
           "ext:node-set($vrtfTokens)"/>
    
      <results>
       <xsl:apply-templates select="$vTokens/*"/>
      </results>
     </xsl:template>
    
     <!-- Recursively splits the text at each '^', emitting one <word> element per field -->
     <xsl:template match="text()" name="tokenize">
      <xsl:param name="pText" select="."/>
    
         <xsl:if test="string-length($pText)">
           <xsl:variable name="vWord" select=
           "substring-before(concat($pText, '^'),'^')"/>
    
           <word>
            <xsl:value-of select="$vWord"/>
           </word>
    
           <xsl:call-template name="tokenize">
            <xsl:with-param name="pText" select=
             "substring-after($pText,'^')"/>
           </xsl:call-template>
         </xsl:if>
     </xsl:template>
    
     <xsl:template match="word">
      <xsl:variable name="vPos" select="position()"/>
    
      <field>
          <xsl:element name="{$vfieldNames/*[position()=$vPos]}">
          </xsl:element>
          <value><xsl:value-of select="."/></value>
      </field>
     </xsl:template>
    </xsl:stylesheet>
    

    When this transformation is applied to the following XML document:

    <t>John^Smith^Bellevue^WA^98004</t>
    

    the wanted, correct result is produced:

    <results>
       <field>
          <FirstName/>
          <value>John</value>
       </field>
       <field>
          <LastName/>
          <value>Smith</value>
       </field>
       <field>
          <City/>
          <value>Bellevue</value>
       </field>
       <field>
          <State/>
          <value>WA</value>
       </field>
       <field>
          <Zip/>
          <value>98004</value>
       </field>
    </results>
    