Question
I have several XML files with a similar structure, but with differences that I cannot ignore. They are all TEI documents.
I am looking for a way to outline the main structure.
Take the following text as an example:
<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
  <body xml:id="d2">
    <div1 type="book" xml:id="d3">
      <head>Songs of Innocence</head>
      <pb n="4"/>
      <div2 type="poem" xml:id="d4">
        <head>Introduction</head>
        <lg type="stanza">
          <l>Piping down the valleys wild, </l>
          <l>Piping songs of pleasant glee, </l>
          <l>On a cloud I saw a child, </l>
          <l>And he laughing said to me: </l>
        </lg>
I would like to suppress the nodes of the same type and all the repeating structures:
<body xml:id="d2">
  <div1 type="book" xml:id="d3">
    <head>Songs of Innocence</head>
    <pb n="4"/>
    <div2 type="poem" xml:id="d4">
      <head>Introduction</head>
      <lg type="stanza">
        <l>...</l>
      </lg>
      <lg>...</lg>
Basically, I want to reduce each XML document to its most basic structure, so that I can figure out how to convert the documents properly with XSLT.
Answer 1:
Here are some options for viewing your XML in a tree structure:
- Open the XML in a web browser and get an outline view with collapsible elements.
- Open the XML in graphics view in Oxygen, QTAssistant, or XMLSpy.
- Use Graphviz or DotML ant build to create your own visual representations.
Note, however, that you'll need to clean up your markup. What you show doesn't qualify as XML as it's missing end tags and lacks a single root element. (XML has to be well-formed.)
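As a quick way to confirm whether a file is well-formed before trying any of the viewers above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The helper name is_well_formed and the two shortened sample strings are my own illustrations, not part of the question.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as one well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Shortened version of the fragment in the question: the open tags are never closed.
broken = ('<text xmlns="http://www.tei-c.org/ns/1.0"><body><div1>'
          '<head>Songs of Innocence</head>')
# The same fragment with the missing end tags supplied.
fixed = broken + '</div1></body></text>'

print(is_well_formed(broken))  # False
print(is_well_formed(fixed))   # True
```

This only checks well-formedness; it does not validate the document against a TEI schema.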
Answer 2:
With the Perl module XML::DT (apt-get install libxml-dt-perl if it is not installed), the command mkxmltype file.xml returns a compact description of the XML structure. Example:
$ mkxmltype -lines=1000 a.xml
# text ...Fri Feb 26 17:56:24 2016
text => body * xml:id
body => div1 * xml:id
div1 => tup(div2, pb, head) * type * xml:id
div2 => tup(head, lg) * type * xml:id
pb => empty * n
head => text
lg => seq(l) * type
l => text
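If Perl is not an option, a similar summary can be sketched in Python with the standard library. This is not a reimplementation of mkxmltype: it only records which child elements and attributes occur under each element name, without the seq/tup distinction. The function names local and summarize and the closed-off sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def local(name: str) -> str:
    """Drop a leading '{namespace}' part, so '{...}div1' becomes 'div1'."""
    return name.rsplit('}', 1)[-1]

def summarize(root):
    """Collect, per element name, the child element names and attribute names seen."""
    children = defaultdict(set)
    attrs = defaultdict(set)
    for elem in root.iter():
        name = local(elem.tag)
        children[name]  # touch the key so that leaf elements are listed too
        attrs[name].update(local(a) for a in elem.attrib)
        for child in elem:
            children[name].add(local(child.tag))
    return children, attrs

# The question's fragment, shortened and with end tags added so that it parses.
sample = """<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
  <body xml:id="d2">
    <div1 type="book" xml:id="d3">
      <head>Songs of Innocence</head>
      <pb n="4"/>
      <div2 type="poem" xml:id="d4">
        <head>Introduction</head>
        <lg type="stanza">
          <l>Piping down the valleys wild, </l>
          <l>Piping songs of pleasant glee, </l>
        </lg>
      </div2>
    </div1>
  </body>
</text>"""

children, attrs = summarize(ET.fromstring(sample))
for name in children:
    kids = ", ".join(sorted(children[name])) or "(no child elements)"
    extra = "".join(f" * {a}" for a in sorted(attrs[name]))
    print(f"{name} => {kids}{extra}")
```

Because namespace parts are stripped, xml:id is reported simply as id. Repeated siblings such as the l lines collapse into one entry, which is close to the reduced outline asked for in the question.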
Source: https://stackoverflow.com/questions/35657962/visualize-xml-tree-structure