I am new to transforming between different formats. My goal is to convert a toolkit's plain-text notation to SVG. As a simple example, an orange ellipse would be written like this (x and y are coordinates, so 0 and 0 mean the ellipse is in the middle):
GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt
My desired output would be SVG code something like this (the cx and cy coordinates were chosen arbitrarily for the example):
<svg width="400" height="400" xmlns="http://www.w3.org/2000/svg" xmlns:svg="http://www.w3.org/2000/svg">
<g>
<ellipse fill="#ff7f00" stroke="#000000" stroke-width="2" stroke-dasharray="null" stroke-linejoin="null" stroke-linecap="null" cx="250" cy="250" id="svg_1" rx="114" ry="70"/>
</g>
</svg>
I found these two threads, "Parse text file with XSLT" and "XSL transform on text to XML with unparsed-text: need more depth", where plain text is transformed to XML with XSLT 2.0 using the unparsed-text() function and regular expressions. In my example, how could I extract the commands such as ELLIPSE (is a regex that recognizes all-uppercase words possible?) and the parameters (can they be reached with XPath from plain text at all?)? Is a good implementation doable in XSLT 2.0, or should I look for another method? Any help would be appreciated!
Below is an example of how you can load the text file using unparsed-text(), parse the content using xsl:analyze-string to produce an intermediate XML document, and then transform that XML using a "push"-style stylesheet.
It shows how to support ELLIPSE, CIRCLE and RECTANGLE text conversion. You may need to customize it a bit, but it should give you an idea of what is possible. With the addition of regex and unparsed-text(), XSLT 2.0 and 3.0 make all sorts of text transformations possible that would have been extremely cumbersome or difficult in XSLT 1.0.
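To get a feel for what the two central regular expressions capture before reading the stylesheet, they can be tried outside XSLT. The following is only a minimal Python sketch: the patterns are the same as the stylesheet's label and attribute patterns, but the function name parse_line is hypothetical and Python's regex dialect is assumed to behave the same as XPath's for these simple patterns.

```python
import re

# Same patterns as the stylesheet's $label-pattern and $attribute-pattern.
LABEL = re.compile(r"[A-Z]+")
ATTRIBUTE = re.compile(r"\s?(\S+):(\S+)")


def parse_line(line):
    """Split a notation line into its upper-case label and a dict of parameters.

    Returns None when the line does not start with an upper-case label.
    (parse_line is a hypothetical helper, not part of any toolkit.)
    """
    label = LABEL.match(line)
    if label is None:
        return None
    # Everything after the label is a run of name:value pairs.
    attributes = dict(ATTRIBUTE.findall(line[label.end():]))
    return label.group(0), attributes


print(parse_line("ELLIPSE x:0pt y:0pt rx:114pt ry:70pt"))
print(parse_line("PEN color:$000000 w:2pt"))
```

The label becomes the element name and each name:value pair becomes an attribute, which is the same shape as the intermediate XML the stylesheet builds.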
With a file called "drawing.txt" with the following content:
GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt
GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
CIRCLE x:0pt y:0pt rx:114pt ry:70pt
GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
RECTANGLE x:0pt y:0pt width:114pt height:70pt
Executing the following XSLT in the same directory:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="local"
    exclude-result-prefixes="xs"
    version="2.0"
    xmlns:svg="http://www.w3.org/2000/svg">

    <xsl:output indent="yes"/>

    <!--matches sequences of UPPER-CASE letters -->
    <xsl:variable name="label-pattern" select="'[A-Z]+'"/>
    <!--matches the "attributes" in the line i.e. w:2pt,
        has two capture groups (1) => attribute name, (2) => attribute value -->
    <xsl:variable name="attribute-pattern" select="'\s?(\S+):(\S+)'"/>
    <!--matches a line of data for the drawing text,
        has two capture groups (1) => label, (2) attribute data-->
    <xsl:variable name="line-pattern" select="concat('(', $label-pattern, ')\s(.*)\n?')"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/">
        <svg width="400" height="400">
            <g>
                <!-- Find the text patterns indicating the shape -->
                <xsl:analyze-string select="unparsed-text('drawing.txt')"
                    regex="{concat('(', $label-pattern, ')\n((', $line-pattern, ')+)\n?')}">
                    <xsl:matching-substring>
                        <!--Convert text to XML -->
                        <xsl:variable name="drawing-markup" as="element()">
                            <!--Create an element for this group, using first matched pattern as the element name
                                (i.e. GRAPHREP => <GRAPHREP>) -->
                            <xsl:element name="{regex-group(1)}">
                                <!--split the second matched group for this shape into lines by breaking on newline-->
                                <xsl:variable name="lines" select="tokenize(regex-group(2), '\n')"/>
                                <xsl:for-each select="$lines">
                                    <!--for each line, run through this process to create an element with attributes
                                        (e.g. FILL color:$ff7f00 => <FILL color="#ff7f00"/>)
                                    -->
                                    <xsl:analyze-string select="." regex="{$line-pattern}">
                                        <xsl:matching-substring>
                                            <!--create an element using the UPPER-CASE label starting the line -->
                                            <xsl:element name="{regex-group(1)}">
                                                <!-- capture each of the attributes -->
                                                <xsl:analyze-string select="regex-group(2)" regex="{$attribute-pattern}">
                                                    <xsl:matching-substring>
                                                        <!--convert foo:bar into attribute foo="bar",
                                                            translate $ => #
                                                            and remove the letters 'p' and 't' by translating into nothing-->
                                                        <xsl:attribute name="{regex-group(1)}" select="translate(regex-group(2), '$pt', '#')"/>
                                                    </xsl:matching-substring>
                                                    <xsl:non-matching-substring/>
                                                </xsl:analyze-string>
                                            </xsl:element>
                                        </xsl:matching-substring>
                                        <xsl:non-matching-substring/>
                                    </xsl:analyze-string>
                                </xsl:for-each>
                            </xsl:element>
                        </xsl:variable>
                        <!--Uncomment the copy-of below if you want to see the intermediate XML $drawing-markup-->
                        <!--<xsl:copy-of select="$drawing-markup"/>-->
                        <!-- Transform XML into SVG -->
                        <xsl:apply-templates select="$drawing-markup"/>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring/>
                </xsl:analyze-string>
            </g>
        </svg>
    </xsl:template>

    <!--==========================================-->
    <!-- Templates to convert the $drawing-markup -->
    <!--==========================================-->

    <!--for supported shapes, create the element using
        lower-case value, and change rectangle to rect
        for the svg element name-->
    <xsl:template match="GRAPHREP[ELLIPSE | CIRCLE | RECTANGLE]">
        <xsl:element name="{replace(lower-case(local-name(ELLIPSE | CIRCLE | RECTANGLE)), 'rectangle', 'rect', 'i')}">
            <xsl:attribute name="id" select="concat('id_', generate-id())"/>
            <xsl:apply-templates />
        </xsl:element>
    </xsl:template>

    <xsl:template match="ELLIPSE | CIRCLE | RECTANGLE"/>

    <!-- Just process the content of GRAPHREP.
        If there are multiple shapes and you want a new
        <svg><g></g></svg> for each shape,
        then move it from the template for "/" into this template-->
    <xsl:template match="GRAPHREP/*">
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <xsl:template match="PEN" priority="1">
        <!--TODO: test if these attributes exist, if they do, do not create these defaults.
            Hard-coding for now, to match desired output, since I don't know what the text
            attributes would be, but could wrap each with <xsl:if test="not(@dasharray)">-->
        <xsl:attribute name="stroke-dasharray" select="'null'"/>
        <xsl:attribute name="stroke-linejoin" select="'null'"/>
        <xsl:attribute name="stroke-linecap" select="'null'"/>
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <!-- converts @color => @stroke -->
    <xsl:template match="PEN/@color">
        <xsl:attribute name="stroke" select="."/>
    </xsl:template>

    <!--converts @w => @stroke-width -->
    <xsl:template match="PEN/@w">
        <xsl:attribute name="stroke-width" select="."/>
    </xsl:template>

    <!--converts @color => @fill and replaces $ with # -->
    <xsl:template match="FILL/@color">
        <xsl:attribute name="fill" select="translate(., '$', '#')"/>
    </xsl:template>

    <!-- converts @x => @cx with hard-coded values.
        May want to use value from text, but matching your example-->
    <xsl:template match="ELLIPSE/@x | ELLIPSE/@y">
        <!--not sure if there was a relationship between ELLIPSE x:0pt y:0pt, and why 0pt would be 250,
            but just an example...-->
        <xsl:attribute name="c{name()}" select="250"/>
    </xsl:template>

</xsl:stylesheet>
Produces the following SVG output:
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns:local="local"
     xmlns:svg="http://www.w3.org/2000/svg"
     width="400"
     height="400">
   <g>
      <ellipse id="id_d2e0"
               stroke-dasharray="null"
               stroke-linejoin="null"
               stroke-linecap="null"
               stroke="#000000"
               stroke-width="2"
               fill="#ff7f00"
               cx="250"
               cy="250"
               rx="114"
               ry="70"/>
      <circle id="id_d3e0"
              stroke-dasharray="null"
              stroke-linejoin="null"
              stroke-linecap="null"
              stroke="#000000"
              stroke-width="2"
              fill="#ff7f00"
              x="0"
              y="0"
              rx="114"
              ry="70"/>
      <rect id="id_d4e0"
            stroke-dasharray="null"
            stroke-linejoin="null"
            stroke-linecap="null"
            stroke="#000000"
            stroke-width="2"
            fill="#ff7f00"
            x="0"
            y="0"
            width="114"
            height="70"/>
   </g>
</svg>
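The stylesheet hard-codes cx and cy to 250 because the relationship between the toolkit's x:0pt y:0pt and SVG coordinates was not clear. Since the question says that 0 and 0 put the ellipse in the middle, one possible mapping is to offset each coordinate by half the canvas size. The Python sketch below only illustrates that arithmetic; the helper names strip_pt and to_svg_center are hypothetical, the 400 by 400 canvas is taken from the example, and it assumes that one pt maps to one SVG user unit and that both y axes point the same way, neither of which the question confirms.

```python
def strip_pt(value):
    """Convert a toolkit length such as '114pt' or '-3.5pt' to a float."""
    if value.endswith("pt"):
        value = value[:-2]
    return float(value)


def to_svg_center(x, y, width=400, height=400):
    """Map toolkit coordinates (origin in the middle) to SVG cx/cy (origin top-left).

    Assumption: the toolkit origin is the canvas centre, as the question suggests.
    """
    return width / 2 + strip_pt(x), height / 2 + strip_pt(y)


print(to_svg_center("0pt", "0pt"))    # (200.0, 200.0)
print(to_svg_center("-50pt", "25pt"))  # (150.0, 225.0)
```

In the stylesheet, the same idea would replace the constant 250 in the ELLIPSE/@x | ELLIPSE/@y template with an expression based on the attribute value.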
XSLT is really handy when you want to transform an XML file into something else. But advanced logic is awkward to express in it, because the syntax is verbose and variables cannot be reassigned once bound. Your parsing looks fairly involved and your input is not XML, so XSLT would not be my first tool of choice here.
If the tool is not fixed, I'd write a parser in a scripting language such as Python, have it create objects from the input, and then have each object produce its own representation as XML.
Edit: A super simple Python script could look like this; you can easily rewrite it in Java:
import re
from xml.etree.ElementTree import Element, SubElement, tostring


def only_numbers(s):
    # Keep digits only, e.g. "114pt" -> "114" (this also drops signs and decimal points)
    return re.sub(r"[^0-9]", "", s)


def only_hex(s):
    # Keep lower-case hex digits only, e.g. "$ff7f00" -> "ff7f00"
    return re.sub(r"[^0-9a-f]", "", s)


# Initialise svg xml
root = Element("svg")
root.set("width", "400")
root.set("height", "400")
root.set("xmlns", "http://www.w3.org/2000/svg")
g = SubElement(root, "g")

# Initialise current state
pen_color = "000000"
pen_width = "1"
fill_color = "ffffff"

# Read input
with open("parsing_data.txt") as f:
    for line in f:
        words = line.split()
        if not words:
            continue  # skip blank lines
        action = words[0]
        # Turn "name:value" words into a dict of parameters
        params = dict(word.split(":", 1) for word in words[1:])
        if action == "GRAPHREP":
            pass
        elif action == "PEN":
            if "color" in params:
                pen_color = only_hex(params["color"])
            if "w" in params:
                pen_width = only_numbers(params["w"])
        elif action == "FILL":
            if "color" in params:
                fill_color = only_hex(params["color"])
        elif action == "ELLIPSE":
            # Draw the ellipse with the pen and fill state seen so far
            ellipse = SubElement(g, "ellipse")
            ellipse.set("fill", "#" + fill_color)
            ellipse.set("stroke", "#" + pen_color)
            ellipse.set("stroke-width", pen_width)
            ellipse.set("cx", only_numbers(params["x"]))
            ellipse.set("cy", only_numbers(params["y"]))
            ellipse.set("rx", only_numbers(params["rx"]))
            ellipse.set("ry", only_numbers(params["ry"]))
        else:
            raise Exception("Invalid input line: " + line)

# Print result (Python 3; encoding="unicode" returns a str instead of bytes)
print(tostring(root, encoding="unicode"))
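The script above keeps everything in one loop. The idea mentioned earlier, creating an object per shape and letting each object produce its own XML, might look roughly like the following sketch. The class name Ellipse, its to_xml method, and the render function are all hypothetical names chosen for illustration; the values passed in are the ones from the question's example.

```python
from dataclasses import dataclass
from xml.etree.ElementTree import Element, SubElement, tostring


@dataclass
class Ellipse:
    """One parsed ELLIPSE line plus the PEN/FILL state active when it was read."""
    cx: str
    cy: str
    rx: str
    ry: str
    fill: str
    stroke: str
    stroke_width: str

    def to_xml(self, parent):
        # Each shape knows how to append its own SVG element to a parent node.
        return SubElement(parent, "ellipse", {
            "cx": self.cx, "cy": self.cy, "rx": self.rx, "ry": self.ry,
            "fill": self.fill, "stroke": self.stroke,
            "stroke-width": self.stroke_width,
        })


def render(shapes, width="400", height="400"):
    """Wrap the shapes in <svg><g>...</g></svg> and return the markup as a string."""
    root = Element("svg", {"xmlns": "http://www.w3.org/2000/svg",
                           "width": width, "height": height})
    g = SubElement(root, "g")
    for shape in shapes:
        shape.to_xml(g)
    return tostring(root, encoding="unicode")


svg = render([Ellipse("0", "0", "114", "70", "#ff7f00", "#000000", "2")])
print(svg)
```

Supporting another shape would then mean adding another class with its own to_xml method, while the parsing loop only decides which class to instantiate.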
Source: https://stackoverflow.com/questions/34682331/xslt-2-0-transform-notation-in-plain-text-to-svg