Grouping following siblings with same name and same attributes causes exception in Saxon

Posted by 时间秒杀一切 on 2019-12-02 20:33:35

Question


I have some XML documents (similar to DocBook) that have to be transformed to XSL-FO. Some of the documents contain poems, and the lines of the poems are written in separate p tags. The verses are separated by br tags. There are also "page" tags, which are irrelevant and should be ignored.

Typical code example:

<h4>Headline</h4>
<p>1st line of 1st verse</p>
<p>2nd line of 1st verse</p>
<br/>
<p>1st line of 2nd verse</p>
<p>2nd line of 2nd verse</p>
<page n="100"/>
<p>3rd line of 2nd verse</p>
<h4>Other headline</h4>

For the XSL-FO output, I would like to gather all the text of a verse into one single fo:block. Right now the mechanism works for code structures as above, but there are some exceptions. The current approach is to decide for every p tag: am I the first line of a verse?

- If yes: collect all the text of this verse and write it into a fo:block, using the attributes of the current (first) p tag to set the formatting of the block.
- If no: the contents were treated earlier, so do nothing.

A first line is a p tag that is immediately preceded by an h4 or a br tag (or by a page tag which itself is immediately preceded by a br tag). That one was easy to develop.
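
The rule just described could be written as a match pattern roughly as follows. This is only a sketch and not the original stylesheet: the fo:block body is a placeholder, and the empty second template is an assumption about how the remaining lines would be suppressed.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format">

    <!-- A p is a first line when the element directly before it is an h4 or br,
         or a page that itself directly follows a br -->
    <xsl:template match="p[preceding-sibling::*[1][self::h4 or self::br
                           or self::page[preceding-sibling::*[1][self::br]]]]">
        <fo:block>
            <!-- Placeholder: the text of the whole verse would be collected here -->
            <xsl:apply-templates/>
        </fo:block>
    </xsl:template>

    <!-- Assumption: every other p was already handled by its verse's first line -->
    <xsl:template match="p"/>

</xsl:stylesheet>
```

The predicate [1] on the reverse axis preceding-sibling selects the nearest preceding element, so the test only looks at the immediate neighbour.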

Collecting the text of a verse was easy for the given example: group all following siblings, with each group ending at an h4 or br tag, then take the first group and use all of its p tags (ignoring any page tags in between and the closing h4 or br tag).

In code:

<xsl:for-each-group select="following-sibling::*" group-ending-with="br|h4">
    <xsl:if test="position()=1">
        <xsl:for-each select="current-group()[not(self::h4) and not(self::br) and not(self::page)]">
            <xsl:apply-templates/>&crt;
        </xsl:for-each>
    </xsl:if>
</xsl:for-each-group>

Now to a similar code example:

<h4>Headline</h4>
<p class="center">1</p>
<p>1st line of 1st verse</p>
<p>2nd line of 1st verse</p>
<br/>
<p class="center">2</p>
<p>1st line of 2nd verse</p>
<p>2nd line of 2nd verse</p>
<page n="100"/>
<p>3rd line of 2nd verse</p>
<h4>Other headline</h4>

Now the centered p tags act like subheadlines for the verses that follow. A centered p is not really a verse, but for my purposes it would be enough if it were separated from the real verse's text. Thus the slightly varied rule for getting all the text of the current verse is: group all following siblings, with each group ending at an h4 or br tag or at a p tag that has a different class than the current p tag, then take the first group and use all of its p tags (ignoring any page tags in between and the closing h4 or br tag).

Therefore I stored the value of the class attribute of the current p tag in a variable called attributes and defined the group rule as:

<xsl:for-each-group select="following-sibling::*" group-ending-with="br|h4|p[normalize-space(@class) != $attributes]">

In turn, when determining whether a p tag is the first line of a verse, it may be preceded not only by an h4 or br, but also by another p tag that has a different class attribute value.

Now this works fine in my testing environment in Oxygen using Saxon-B9.1.0.6. But the transformation has to be performed in Java using Saxon9.jar, and there the use of a variable inside the group-ending-with attribute of xsl:for-each-group causes an exception.

And now I am kind of stuck.

Could the grouping conditions be defined in a better way? Or should this maybe not be done with grouping at all, but with a totally different approach?

The source files are as they are; the tagging might not be optimal, but it is what it is. The transformation is not new but was gradually adapted to our needs. Source code with poems in it was simply avoided earlier, but I'd like to find a solution for this.

Any help would be greatly appreciated.

Best regards,

Christian Kirchhoff


Answer 1:


This stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="div[@class='poem']">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:for-each-group select="*" group-ending-with="br|h4">
                <div class="strophe">
                    <xsl:copy-of select="current-group()/self::p[not(@class)]"/>
                </div>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

With this input:

<div class="poem">
    <h4>Headline</h4>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
</div>

Output:

<div class="poem">
    <div class="strophe">
        <p>1st line of 1st verse</p>
        <p>2nd line of 1st verse</p>
    </div>
    <div class="strophe">
        <p>1st line of 2nd verse</p>
        <p>2nd line of 2nd verse</p>
        <p>3rd line of 2nd verse</p>
    </div>
</div>
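
One caveat, as far as I can tell from the XSLT 2.0 rules for group-ending-with: because the population is select="*", the leading h4 ends a group of its own, so a conforming processor may also emit an empty strophe div before the first verse. A minimal sketch of a guard, assuming that groups without any unclassed p should simply be skipped:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="div[@class='poem']">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:for-each-group select="*" group-ending-with="br|h4">
                <!-- Emit a strophe only when the group holds at least one unclassed p -->
                <xsl:if test="current-group()/self::p[not(@class)]">
                    <div class="strophe">
                        <xsl:copy-of select="current-group()/self::p[not(@class)]"/>
                    </div>
                </xsl:if>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```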

With this input:

<div class="poem">
    <h4>Headline</h4>
    <p class="center">1</p>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p class="center">2</p>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
</div>

Output:

<div class="poem">
    <div class="strophe">
        <p>1st line of 1st verse</p>
        <p>2nd line of 1st verse</p>
    </div>
    <div class="strophe">
        <p>1st line of 2nd verse</p>
        <p>2nd line of 2nd verse</p>
        <p>3rd line of 2nd verse</p>
    </div>
</div>

So, this stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="div[@class='poems']">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:for-each-group select="*[preceding-sibling::h4]"
                                group-starting-with="h4">
                <div class="poem">
                    <xsl:for-each-group select="current-group()"
                                        group-ending-with="br">
                        <div class="strophe">
                            <xsl:copy-of select="current-group()
                                                  /self::p[not(@class)]"/>
                        </div>
                    </xsl:for-each-group>
                </div>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

With this input:

<div class="poems">
    <h3>Poems</h3>
    <h4>Headline</h4>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
    <h4>Headline</h4>
    <p class="center">1</p>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p class="center">2</p>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
</div>

Output:

<div class="poems">
    <div class="poem">
        <div class="strophe">
            <p>1st line of 1st verse</p>
            <p>2nd line of 1st verse</p>
        </div>
        <div class="strophe">
            <p>1st line of 2nd verse</p>
            <p>2nd line of 2nd verse</p>
            <p>3rd line of 2nd verse</p>
        </div>
    </div>
    <div class="poem">
        <div class="strophe">
            <p>1st line of 1st verse</p>
            <p>2nd line of 1st verse</p>
        </div>
        <div class="strophe">
            <p>1st line of 2nd verse</p>
            <p>2nd line of 2nd verse</p>
            <p>3rd line of 2nd verse</p>
        </div>
    </div>
</div>
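
Both stylesheets above drop the centered p elements. If they should instead become blocks of their own, as the question describes, one possible variation is to nest a group-adjacent pass inside the br/h4 grouping. Since group-adjacent takes an ordinary expression rather than a pattern, no variable reference is needed inside a pattern. This is a sketch under the same assumed div[@class='poem'] wrapper as the first stylesheet, not a tested drop-in for the original XSL-FO transformation:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="div[@class='poem']">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <!-- Outer pass: split the children into runs that end at a br or h4 -->
            <xsl:for-each-group select="*" group-ending-with="br|h4">
                <!-- Inner pass: keep only the p elements of the run (page, br and h4
                     are dropped) and group neighbours that share the same
                     normalized class value; a missing class counts as "" -->
                <xsl:for-each-group select="current-group()/self::p"
                                    group-adjacent="normalize-space(@class)">
                    <div class="strophe">
                        <xsl:copy-of select="current-group()"/>
                    </div>
                </xsl:for-each-group>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

For the second sample input this should yield four strophe divs per poem: the centered "1", the two lines of the first verse, the centered "2", and the three lines of the second verse. A run that contains no p at all, such as the leading h4, produces no inner group and therefore no empty div.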


Source: https://stackoverflow.com/questions/3823561/grouping-following-siblings-with-same-name-and-same-attributes-causes-exception
