How to remove duplicate XML nodes using XSLT

日久生厌 2020-12-03 19:47

I've got an extremely long XML file, like

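<Root>
   <ele1>
      <child1>context1</child1>
      <child2>test1</child2>
      <child1>context1</child1>
   </ele1>
   <ele2>
      <child1>context2</child1>
      <child2>test2</child2>
      <child1>context2</child1>
   </ele2>
</Root>
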
3 Answers
  • 2020-12-03 20:25

    If the OP's provided XML is representative of the question (that is, the 2nd <child1> inside each <ele*> element should be removed), then Muenchian Grouping isn't necessary:

    XSLT:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output omit-xml-declaration="no" indent="yes"/>
      <xsl:strip-space elements="*"/>
    
      <!-- Identity Template: copies everything as-is -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>
    
      <!-- Remove the 2nd <child1> element from each <ele*> element -->
      <xsl:template match="*[starts-with(name(), 'ele')]/child1[2]" />
    
    </xsl:stylesheet>
    

    When run against the provided XML:

    <?xml version="1.0" encoding="UTF-8"?>
    <Root>
      <ele1>
        <child1>context1</child1>
        <child2>test1</child2>
        <child1>context1</child1>
      </ele1>
      <ele2>
        <child1>context2</child1>
        <child2>test2</child2>
        <child1>context2</child1>
      </ele2>
    </Root>
    

    ...the desired result is produced:

    <?xml version="1.0" encoding="UTF-8"?>
    <Root>
      <ele1>
        <child1>context1</child1>
        <child2>test1</child2>
      </ele1>
      <ele2>
        <child1>context2</child1>
        <child2>test2</child2>
      </ele2>
    </Root>
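
    If the duplicate is not always the 2nd <child1>, a value-based match may be more robust. The following is only a sketch, assuming that a <child1> counts as a duplicate when an earlier sibling <child1> has the same text; it swaps the positional predicate for a sibling comparison:

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output omit-xml-declaration="no" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- Identity Template: copies everything as-is -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <!-- Assumption: a <child1> is a duplicate when its text equals the text
           of any earlier sibling <child1>; such elements are dropped -->
      <xsl:template match="*[starts-with(name(), 'ele')]/child1[. = preceding-sibling::child1]"/>

    </xsl:stylesheet>
    ```

    For the sample input above this should give the same output, and it would also drop a third or fourth repeated <child1>.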
    
  • 2020-12-03 20:31

    Your XML and question are somewhat unclear, but what you're looking for is commonly called Muenchian grouping, the usual XSLT 1.0 technique for selecting distinct nodes. With appropriate keys it can be done very efficiently.
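
    As a rough sketch of what such a key could look like for this input: the key name kChildByParentNameValue and the choice to group by parent, element name, and text value are assumptions, not something stated in the question.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- Hypothetical key: one group per (parent, element name, text value) -->
      <xsl:key name="kChildByParentNameValue"
               match="*[starts-with(name(), 'ele')]/*"
               use="concat(generate-id(..), '|', name(), '|', .)"/>

      <!-- Identity template: copies everything as-is -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <!-- Muenchian test, inverted: drop any child that is NOT the first
           member of its key group -->
      <xsl:template match="*[starts-with(name(), 'ele')]/*[generate-id() != generate-id(key('kChildByParentNameValue', concat(generate-id(..), '|', name(), '|', .))[1])]"/>

    </xsl:stylesheet>
    ```

    The first member of each group is kept; a later child is dropped only when its parent, name, and text all match an earlier one.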

  • 2020-12-03 20:44

    This question needs a more detailed answer than a pointer to a good Muenchian Grouping source.

    The reason is that the required grouping must take into account both the name of each child of an "ele[SomeString]" element and the identity of its parent. Such grouping calls for a key built from both values, usually by concatenating them.

    This transformation:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
    
      <!-- Key: index every element by its parent's id plus its own name -->
      <xsl:key name="kElByName" match="*"
               use="concat(generate-id(..), '+', name())"/>
    
      <!-- Identity template: copies everything as-is -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>
    
      <!-- For each <ele*> element, process only the first child of each name -->
      <xsl:template match="*[starts-with(name(), 'ele')]">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:apply-templates select=
            "*[generate-id()
               =
               generate-id(key('kElByName',
                               concat(generate-id(..), '+', name())
                              )[1])
              ]"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>
    

    when applied to this XML document:

    <Root>
        <ele1>
            <child1>context1</child1>
            <child2>test1</child2>
            <child1>context1</child1>
        </ele1>
        <ele2>
            <child1>context2</child1>
            <child2>test2</child2>
            <child1>context2</child1>
        </ele2>
        <ele3>
            <child2>context2</child2>
            <child2>test2</child2>
            <child1>context1</child1>
        </ele3>
    </Root>
    

    produces the wanted result:

    <Root>
        <ele1>
            <child1>context1</child1>
            <child2>test1</child2>
        </ele1>
        <ele2>
            <child1>context2</child1>
            <child2>test2</child2>
        </ele2>
        <ele3>
            <child2>context2</child2>
            <child1>context1</child1>
        </ele3>
    </Root>
    