How do I prevent duplicates, in XSL?

Posted by 和自甴很熟 on 2019-12-18 15:52:23

Question


How do I prevent duplicate entries in a list, and ideally sort that list as well? When information is missing at one level, I take it from the level below to build the missing list in the level above. Currently, I have XML similar to this:

<c03 id="ref6488" level="file">
    <did>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

I then apply the following XSL:

<xsl:template match="c03/did">
    <xsl:choose>
        <xsl:when test="not(container)">
            <did>
                <!-- If no c03 container item is found, look in the c04 level for one -->
                <xsl:if test="../c04/did/container">

                    <!-- If a c04 container item is found, use the info to build a c03 version -->
                    <!-- Skip c03 container item, if still no c04 items found -->
                    <container label="Box" type="Box">

                        <!-- Build container list -->
                        <!-- Test for more than one item, and if so, list them, -->
                        <!-- separated by commas and a space -->
                        <xsl:for-each select="../c04/did">
                            <xsl:if test="position() &gt; 1">, </xsl:if>
                            <xsl:value-of select="container"/>
                        </xsl:for-each>
                    </container>
                </xsl:if>
            </did>
        </xsl:when>

        <!-- If there is a c03 container item(s), list it normally -->
        <xsl:otherwise>
            <xsl:copy-of select="."/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

But I'm getting the "container" result of

<container label="Box" type="Box">156, 156, 154</container>

when what I want is

<container label="Box" type="Box">154, 156</container>

Below is the full result that I'm trying to get:

<c03 id="ref6488" level="file">
    <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

Thanks in advance for any help!


Answer 1:


Try the following code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"></xsl:output>

<xsl:template match="node() | @*">
  <xsl:copy>
    <xsl:apply-templates select="node() | @*"/>
  </xsl:copy>
</xsl:template>

  <xsl:template match="c03/did">
    <xsl:choose>
      <xsl:when test="not(container)">
        <did>
          <!-- If no c03 container item is found, look in the c04 level for one -->
          <xsl:if test="../c04/did/container">
            <xsl:variable name="foo" select="../c04/did/container[@type='Box']/text()"/>
            <!-- If a c04 container item is found, use the info to build a c03 version -->
            <!-- Skip c03 container item, if still no c04 items found -->
            <container label="Box" type="Box">

              <!-- Build container list -->
              <!-- Test for more than one item, and if so, list them, -->
              <!-- separated by commas and a space -->
              <xsl:for-each select="distinct-values($foo)">
                <xsl:sort />
                <xsl:if test="position() &gt; 1">, </xsl:if>
                <xsl:value-of select="." />
              </xsl:for-each>
            </container>
          </xsl:if>
          <!-- Copy the existing children even when no c04 container was found -->
          <xsl:apply-templates select="*" />
        </did>
      </xsl:when>

      <!-- If there is a c03 container item(s), list it normally -->
      <xsl:otherwise>
        <xsl:copy-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

The result closely matches the output you want:

<?xml version="1.0" encoding="UTF-8"?>
<c03 id="ref6488" level="file">
  <did>
      <container label="Box" type="Box">154, 156</container>
      <unittitle>Clinic Building</unittitle>
      <unitdate era="ce" calendar="gregorian">1947</unitdate>
   </did>
  <c04 id="ref34582" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <container label="Folder" type="Folder">3</container>
      </did>
  </c04>
  <c04 id="ref6540" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <unittitle>Contact prints</unittitle>
      </did>
  </c04>
  <c04 id="ref6606" level="file">
      <did>
         <container label="Box" type="Box">154</container>
         <unittitle>Negatives</unittitle>
      </did>
  </c04>
</c03>

The trick is to use <xsl:sort> and distinct-values() together. See the (IMHO) great book by Michael Kay, "XSLT 2.0 and XPath 2.0".
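As a minimal sketch of that combination, the stylesheet below hard-codes the values as a literal sequence (an assumption for illustration, not part of the original answer), so it can be run against any XML input with an XSLT 2.0 processor. With the three values shown it should print "154, 156".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- distinct-values() drops the repeated '156' -->
    <xsl:for-each select="distinct-values(('156', '156', '154'))">
      <!-- default sort: compares the values as text -->
      <xsl:sort/>
      <!-- position() counts in sorted order, so the comma goes before items 2..n -->
      <xsl:if test="position() &gt; 1">, </xsl:if>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```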




Answer 2:


There is no need for an XSLT 2.0 solution for this problem.

Here is an XSLT 1.0 solution, which is more compact than the currently selected XSLT 2.0 solution (35 lines vs. 43 lines):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="kBoxContainerByVal"
     match="container[@type='Box']" use="."/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="c03/did[not(container)]">
   <xsl:copy>

   <xsl:variable name="vContDistinctValues" select=
    "/*/*/*/container[@type='Box']
            [generate-id()
            =
             generate-id(key('kBoxContainerByVal', .)[1])
            ]
            "/>

    <container label="Box" type="Box">
      <xsl:for-each select="$vContDistinctValues">
        <xsl:sort data-type="number"/>

        <xsl:value-of select=
        "concat(., substring(', ', 1 + 2*(position() = last())))"/>
      </xsl:for-each>
    </container>
    <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the originally provided XML document, the correct, wanted result is produced:

<c03 id="ref6488" level="file">
   <did>
      <container label="Box" type="Box">154, 156</container>
      <unittitle>Clinic Building</unittitle>
      <unitdate era="ce" calendar="gregorian">1947</unitdate>
   </did>
   <c04 id="ref34582" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <container label="Folder" type="Folder">3</container>
      </did>
   </c04>
   <c04 id="ref6540" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <unittitle>Contact prints</unittitle>
      </did>
   </c04>
   <c04 id="ref6606" level="file">
      <did>
         <container label="Box" type="Box">154</container>
         <unittitle>Negatives</unittitle>
      </did>
   </c04>
</c03>

Update:

I didn't notice the requirement that the container numbers must appear sorted. Now the solution reflects this.




Answer 3:


Try grouping with a key in XSLT. Here is an article on the Muenchian method, which should help to eliminate duplicates: http://www.jenitennison.com/xslt/grouping/muenchian.html




Answer 4:


A slightly shorter XSLT 2.0 version, combining approaches from the other answers. Note that the sort is alphabetical: if the labels "54" and "156" are found, the output will be "156, 54". If a numerical sort is needed, use <xsl:sort select="number(.)"/> instead of <xsl:sort/>.

<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/> 
    <xsl:strip-space elements="*"/>

    <xsl:template match="node()|@*"> 
        <xsl:copy> 
            <xsl:apply-templates select="node()|@*"/> 
        </xsl:copy> 
    </xsl:template> 

    <xsl:template match="c03/did[not(container)]">
        <xsl:variable name="containers" 
                      select="../c04/did/container[@label='Box'][text()]"/>
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:if test="$containers">
                <container label="Box" type="Box">
                    <xsl:for-each select="distinct-values($containers)">
                        <xsl:sort/>
                        <xsl:if test="position() != 1">, </xsl:if>
                        <xsl:value-of select="."/>
                    </xsl:for-each>
                </container> 
            </xsl:if>
            <xsl:apply-templates select="node()"/> 
        </xsl:copy> 
    </xsl:template> 
</xsl:stylesheet>
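
To illustrate the note about sort order, here is a small self-contained sketch. The $labels variable and its literal values are assumptions made for the demonstration; the stylesheet can be applied to any XML input with an XSLT 2.0 processor. It should print "156, 54" for the text sort and "54, 156" for the numeric sort.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- hypothetical labels, with one duplicate -->
  <xsl:variable name="labels" select="('54', '156', '54')"/>

  <xsl:template match="/">
    <xsl:text>text sort: </xsl:text>
    <xsl:for-each select="distinct-values($labels)">
      <!-- strings are compared character by character, so '156' precedes '54' -->
      <xsl:sort/>
      <xsl:if test="position() != 1">, </xsl:if>
      <xsl:value-of select="."/>
    </xsl:for-each>

    <xsl:text>&#10;numeric sort: </xsl:text>
    <xsl:for-each select="distinct-values($labels)">
      <!-- number(.) turns each label into a number before comparing -->
      <xsl:sort select="number(.)"/>
      <xsl:if test="position() != 1">, </xsl:if>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```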



Answer 5:


A solution that makes fuller use of XSLT 2.0 features, also quite short:

<xsl:stylesheet  version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
>
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="c03/did[not(container)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>

      <xsl:variable name="vContDistinctValues" as="xs:integer*">
        <xsl:perform-sort select=
          "distinct-values(/*/*/*/container[@type='Box']/text()/xs:integer(.))">
          <xsl:sort/>
        </xsl:perform-sort>
      </xsl:variable>

      <xsl:if test="exists($vContDistinctValues)">
        <container label="Box" type="Box">
          <xsl:value-of select="$vContDistinctValues" separator=", "/>
        </container>
      </xsl:if>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Do note:

  1. The use of types avoids the need to specify the data-type in <xsl:sort/>.

  2. The use of the separator attribute of <xsl:value-of/> avoids an explicit loop for inserting the commas.
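
Both points can be seen in isolation in the following sketch, which uses a literal integer sequence (an assumption for illustration) instead of values read from a document. Run against any XML input with an XSLT 2.0 processor, it should print "54, 156".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- the items are xs:integer, so a bare xsl:sort compares them numerically -->
    <xsl:variable name="sorted" as="xs:integer*">
      <xsl:perform-sort select="distinct-values((156, 54, 156))">
        <xsl:sort/>
      </xsl:perform-sort>
    </xsl:variable>

    <!-- separator joins the items of the sequence without an explicit loop -->
    <xsl:value-of select="$sorted" separator=", "/>
  </xsl:template>
</xsl:stylesheet>
```

A text sort of the same values would put 156 first, which is why the typed variable matters here.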




Answer 6:


The following XSLT 1.0 transformation does what you are looking for:

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> 
  <xsl:output encoding="utf-8" />

  <!-- key to index containers by these three distinct qualities: 
       1: their ancestor <c??> node (represented as its unique ID)
       2: their @type attribute value
       3: their node value (i.e. their text) -->
  <xsl:key 
    name  = "kContainer" 
    match = "container"
    use   = "concat(generate-id(../../..), '|', @type, '|', .)"
  />

  <!-- identity template to copy everything as is by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <!-- special template for <did>s without a <container> child -->
  <xsl:template match="did[not(container)]">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <container label="Box" type="Box">
        <!-- from subordinate <container>s of type Box, use the ones
             that are *the first* to have that certain combination 
             of the three distinct qualities mentioned above -->
        <xsl:apply-templates mode="list-values" select="
          ../*/did/container[@type='Box'][
            generate-id()
            =
            generate-id(
              key(
                'kContainer', 
                concat(generate-id(../../..), '|', @type, '|', .)
              )[1]
            )
          ]
        ">
          <!-- sort them by their node value -->
          <xsl:sort select="." data-type="number" />
        </xsl:apply-templates>
      </container>
      <xsl:apply-templates select="node()" />
    </xsl:copy>
  </xsl:template>

  <!-- generic template to make list of values from any node-set -->
  <xsl:template match="*" mode="list-values">
    <xsl:value-of select="." />
    <xsl:if test="position() &lt; last()">
      <xsl:text>, </xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

It returns:

<c03 id="ref6488" level="file">
  <did>
    <container label="Box" type="Box">154, 156</container>
    <unittitle>Clinic Building</unittitle>
    <unitdate era="ce" calendar="gregorian">1947</unitdate>
  </did>
  <c04 id="ref34582" level="file">
    <did>
      <container label="Box" type="Box">156</container>
      <container label="Folder" type="Folder">3</container>
    </did>
  </c04>
  <c04 id="ref6540" level="file">
    <did>
      <container label="Box" type="Box">156</container>
      <unittitle>Contact prints</unittitle>
    </did>
  </c04>
  <c04 id="ref6606" level="file">
    <did>
      <container label="Box" type="Box">154</container>
      <unittitle>Negatives</unittitle>
    </did>
  </c04>
</c03>

The generate-id() = generate-id(key(...)[1]) part is what's called Muenchian grouping. Unless you can use XSLT 2.0, this is the way to go.
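
Stripped of the specifics of this document, the pattern can be sketched as follows. The <items>/<item> input and the key name kItemByValue are hypothetical, chosen only to show the technique; for the input shown in the comment the stylesheet should print "a, b".

```xml
<!-- Assumed input document:
     <items><item>b</item><item>a</item><item>b</item></items> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- index every item by its text value -->
  <xsl:key name="kItemByValue" match="item" use="."/>

  <xsl:template match="/">
    <!-- keep an item only if it is the first node in its key group,
         which leaves exactly one item per distinct value -->
    <xsl:for-each select="//item[generate-id()
                                 = generate-id(key('kItemByValue', .)[1])]">
      <xsl:sort select="."/>
      <xsl:value-of select="."/>
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The key lookup is what keeps this efficient: each item is compared only against the first member of its own group rather than against every preceding item.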



Source: https://stackoverflow.com/questions/2669813/how-do-i-prevent-duplicates-in-xsl
