Get value after each last colon

♀尐吖头ヾ submitted on 2020-01-11 13:07:08

Question


I need to get the value that follows the last colon on each line of my data. For example, I have this file:

<Data>
:20:PmtReferenceID000012
:21:Not used
:25: PHMNLBICXXX/Account00010203
:28c:00001/0001 (The 'c' in :28 can be either in upper or lower case)

:20:PmtReferenceID000012
:21:Not used
:25: PHMNLBICXXX/Account00010203
:28c:00001/0001 (The 'c' in :28 can be either in upper or lower case)
</Data>

I need to store the value after ':20:' in <ABCD>, after ':21:' in <EFGH>, after ':25:' in <IJKL> and after ':28c:' in <MNOP>.

Here is my XSLT:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
<xsl:template match="Data">
    <Data>
        <xsl:variable name="OneLine" select="replace(translate(.,'&#10;', '|'),'&#xD;','')"/>
        <ABCD>
            <xsl:value-of select="substring-before(substring-after($OneLine, ':20:'),'|:')"/>
        </ABCD>
        <EFGH>
            <xsl:value-of select="substring-before(substring-after($OneLine, ':21:'),'|:')"/>
        </EFGH>
        <IJKL>
            <xsl:value-of select="substring-before(substring-after($OneLine, ':25:'),'|:')"/>
        </IJKL>
        <MNOP>
            <xsl:value-of select="substring-before(substring-after($OneLine, ':28c:'),'|:')"/>
        </MNOP>
    </Data>
</xsl:template>
</xsl:stylesheet>

Expected output:

<Data>
   <ABCD>PmtReferenceID000012</ABCD>
   <EFGH>Not used</EFGH>
   <IJKL> PHMNLBICXXX/Account00010203</IJKL>
   <MNOP>00001/0001</MNOP>
</Data>
<Data>
   <ABCD>PmtReferenceID000012</ABCD>
   <EFGH>Not used</EFGH>
   <IJKL> PHMNLBICXXX/Account00010203</IJKL>
   <MNOP>00001/0001</MNOP>
</Data>

What I did was first replace each line feed with a pipe ('|') and strip the carriage returns, so that to get the value for, say, ':20:', I take the substring after ':20:' and before the next '|'. Is there an easier way to get the value after each last colon? There are so many keys that the method above does not scale. I'm thinking of using an index or position and storing all the keys (':20:', ':21:', ':25:', ':28c:'), so that if the next record contains ':21:', ':25:' or ':28c:', it takes the value before that key. But I have no idea how to do that in XSLT.

Your feedback is highly appreciated!

Thanks,


Answer 1:


I wrote this answer to your original post but didn't post it because it was essentially similar to the one posted by zx485.

However, I still recommend using a key to retrieve the corresponding element name (and I also think the regex can be simpler and more robust).

I have added a tokenizing step that splits the data into separate <Data> wrappers wherever two consecutive line feeds (that is, a blank line) occur.

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:variable name="map">
    <name key="20">ABCD</name>
    <name key="21">EFGH</name>
    <name key="25">IJKL</name>
    <name key="28C">MNOP</name>
</xsl:variable>

<xsl:key name="nm" match="name" use="@key" />

<xsl:template match="Data">
    <xsl:for-each select="tokenize(., '\n\n')">
        <Data>
            <xsl:analyze-string select="." regex="^:([^:]*):(.*)$" flags="m">
                <xsl:matching-substring>
                    <xsl:element name="{key('nm', upper-case(regex-group(1)), $map)}">
                        <xsl:value-of select="regex-group(2)" />
                    </xsl:element>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </Data>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

Demo: http://xsltransform.net/ehVYZNm




Answer 2:


In XSLT 3.0 you could write a separate template for each kind of string, e.g.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="xs math"
    version="3.0">

    <xsl:output indent="yes"/>

    <xsl:template match="Data">
        <xsl:copy>
            <xsl:apply-templates select="tokenize(., '\r?\n')[normalize-space()]"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match=".[. instance of xs:string and matches(., '^:20:')]">
        <ABCD>
            <xsl:value-of select="replace(., '^:20:', '')"/>
        </ABCD>
    </xsl:template>

    <xsl:template match=".[. instance of xs:string and matches(., '^:21:')]">
        <EFGH>
            <xsl:value-of select="replace(., '^:21:', '')"/>
        </EFGH>
    </xsl:template>

    <xsl:template match=".[. instance of xs:string and matches(., '^:25:')]">
        <IJKL>
            <xsl:value-of select="replace(., '^:25:', '')"/>
        </IJKL>
    </xsl:template>

    <xsl:template match=".[. instance of xs:string and matches(., '^:28c:', 'i')]">
        <MNOP>
            <xsl:value-of select="replace(., '^:28c:', '', 'i')"/>
        </MNOP>
    </xsl:template>    
</xsl:stylesheet>

With Saxon 9.8 or Altova XMLSpy/Raptor that does the job and outputs

<Data>
   <ABCD>PmtReferenceID000012</ABCD>
   <EFGH>Not used</EFGH>
   <IJKL> PHMNLBICXXX/Account00010203</IJKL>
   <MNOP>00001/0001</MNOP>
</Data>

(for the input

<Data>
:20:PmtReferenceID000012
:21:Not used
:25: PHMNLBICXXX/Account00010203
:28c:00001/0001
</Data>

)

As an alternative, instead of tokenizing and processing strings you could use the analyze-string function and match on the returned fn:match elements:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="xs fn math"
    version="3.0">

    <xsl:output indent="yes"/>

    <xsl:template match="Data">
        <xsl:copy>
            <xsl:apply-templates select="analyze-string(., '^(:[0-9]+[a-z]*:)(.*)\r?\n', 'im')//fn:match"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="fn:match[fn:group[@nr = 1][. = ':20:']]">
        <ABCD>
            <xsl:value-of select="fn:group[@nr = 2]"/>
        </ABCD>
    </xsl:template>

    <xsl:template match="fn:match[fn:group[@nr = 1][. = ':21:']]">
        <EFGH>
            <xsl:value-of select="fn:group[@nr = 2]"/>
        </EFGH>
    </xsl:template>

    <xsl:template match="fn:match[fn:group[@nr = 1][. = ':25:']]">
        <IJKL>
            <xsl:value-of select="fn:group[@nr = 2]"/>
        </IJKL>
    </xsl:template>

    <xsl:template match="fn:match[fn:group[@nr = 1][matches(., '^:28c:', 'i')]]">
        <MNOP>
            <xsl:value-of select="fn:group[@nr = 2]"/>
        </MNOP>
    </xsl:template>

</xsl:stylesheet>

Finally, taking up the idea of a map parameter to define the element names, the second solution can be shortened to

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="xs fn math"
    version="3.0">

    <xsl:param name="map" as="map(xs:string, xs:string)"
        select="map {
                  '20' : 'ABCD',
                  '21' : 'EFGH',
                  '25' : 'IJKL',
                  '28c' : 'MNOP'
                }"/>

    <xsl:output indent="yes"/>

    <xsl:template match="Data">
        <xsl:copy>
            <xsl:apply-templates select="analyze-string(., '^(:([0-9]+[a-z]*):)(.*)\r?\n', 'im')//fn:match" mode="wrap"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="match" mode="wrap" xpath-default-namespace="http://www.w3.org/2005/xpath-functions">
        <xsl:element name="{$map(lower-case(.//group[@nr = 2]))}">
            <xsl:value-of select="group[@nr = 3]"/>
        </xsl:element>
    </xsl:template>

</xsl:stylesheet>



Answer 3:


Is there an easier way to get the value after each last colon, because there are so many keys [...]

Yes, you can use regex matching.
In the following template, regex-group(1) contains the key and regex-group(2) contains the string after the second (last) colon.

<xsl:template match="Data">
    <Data>
        <xsl:analyze-string select="." regex=":([0-9A-Za-z]+):(.*)\n">
            <xsl:matching-substring>
                (<xsl:value-of select="regex-group(1)" /> --- <xsl:value-of select="regex-group(2)" />)<xsl:text>&#xa;</xsl:text>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </Data>
</xsl:template>

Partial output:

(20 --- PmtReferenceID000012)
(21 --- Not used)
(25 ---  PHMNLBICXXX/Account00010203)
(28c --- 00001/0001 (The 'c' in :28 can be either in upper or lower case))

With that, you can set up a key/value dictionary that maps each key to the name of the element to wrap around the text.

Like this:

  • Key: 20, Value: ABCD
  • Key: 21, Value: EFGH
  • ...

For example: you can create a variable inside the XSL file to store the mapping:

<xsl:stylesheet version="2.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns:xs="http://www.w3.org/2001/XMLSchema" 
  xmlns:fn="http://www.w3.org/2005/xpath-functions"
  xmlns:map="http://custom.map">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:variable name="Mapping">
    <Map key="20">ABCD</Map>
    <Map key="21">EFGH</Map>
    <Map key="25">IJKL</Map>
    <Map key="28c">MNOP</Map>
  </xsl:variable>

  <xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Data">
    <Data>
        <xsl:analyze-string select="." regex=":([0-9A-Za-z]+):(.*)\n">
            <xsl:matching-substring>
                <!-- lower-case() lets ':28C:' match the '28c' entry as well -->
                <xsl:element name="{$Mapping/Map[@key=lower-case(regex-group(1))]/text()}"><xsl:value-of select="regex-group(2)" /></xsl:element>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </Data>
  </xsl:template>
</xsl:stylesheet>

Full output:

<?xml version="1.0" encoding="UTF-8"?>
<Data xmlns:xs="http://www.w3.org/2001/XMLSchema"
      xmlns:fn="http://www.w3.org/2005/xpath-functions"
      xmlns:map="http://custom.map">
   <ABCD>PmtReferenceID000012</ABCD>
   <EFGH>Not used</EFGH>
   <IJKL> PHMNLBICXXX/Account00010203</IJKL>
   <MNOP>00001/0001 (The 'c' in :28 can be either in upper or lower case)</MNOP>
</Data>

Alternatively, you could move the mapping out of the stylesheet and keep it in a separate file.
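
A minimal sketch of that external-mapping variant might look like the following. The file name mapping.xml, its <Mapping> root element and its location next to the stylesheet are assumptions made for illustration; they do not come from the question. The mapping file would hold the same <Map> entries as the variable above:

```xml
<!-- mapping.xml: hypothetical file, assumed to sit in the same directory as the stylesheet -->
<Mapping>
  <Map key="20">ABCD</Map>
  <Map key="21">EFGH</Map>
  <Map key="25">IJKL</Map>
  <Map key="28c">MNOP</Map>
</Mapping>
```

The stylesheet then loads it with document() instead of declaring the entries inline:

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Assumption: a relative URI given to document() is resolved against
       the stylesheet's base URI, so mapping.xml is looked up next to this file -->
  <xsl:variable name="Mapping" select="document('mapping.xml')/Mapping"/>

  <xsl:template match="Data">
    <Data>
      <!-- flags="m" makes ^ and $ match at the start and end of every line -->
      <xsl:analyze-string select="." regex="^:([0-9A-Za-z]+):(.*)$" flags="m">
        <xsl:matching-substring>
          <!-- Keys in mapping.xml are lower case, so '28C' is normalised to '28c' -->
          <xsl:variable name="name"
                        select="$Mapping/Map[@key = lower-case(regex-group(1))]"/>
          <!-- Skip lines whose key has no entry, rather than failing on an empty element name -->
          <xsl:if test="$name">
            <xsl:element name="{$name}">
              <xsl:value-of select="regex-group(2)"/>
            </xsl:element>
          </xsl:if>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </Data>
  </xsl:template>
</xsl:stylesheet>
```

With this layout, adding a new key only means adding one <Map> line to mapping.xml; the stylesheet itself stays unchanged.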



Source: https://stackoverflow.com/questions/45523898/get-value-after-each-last-colon
