Question
I have an XML file that is nearly correct, but not quite.
Error on line 302211.
Extra Content at the end of the document.
I've spent literally two days trying to debug this, but the file is so big it's nearly impossible. Is there anything I can do?
Here are the relevant lines (I've included the 2 lines before the error; the error begins on the <seg> tag).
<tu>
<tuv xml:lang="en">
<prop type="feed"></prop>
<seg>
<bpt i="1" x="1" type="feed">
test
</bpt>
To switch on computer:
<ept i="1">
>
</ept>
Press device
<ph x="2" type="feed">
&lt;schar _TR="123" y.io.name
</ph> or
<ph x="3" type="feed">
&lt;schar _TR="274" y.io.name="
</ph> (Spain) twice.
</seg>
</tuv>
</tu>
Can anyone give me some pointers on finding the issue here? I am using the Notepad++ XML plugin.
Answer 1:
Background notes
- The XML fragment you've posted stands on its own as a well-formed XML document – the problem must be somewhere else in your XML.
- Your particular XML problem is well-formedness, not validity.
Tips for finding XML well-formedness problems
- Use an XML parser with better diagnostic messages. Xerces-based tools have very good messages (albeit with a few exceptions).
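As one way to get a precise location out of a parser, here is a minimal Python sketch that streams a document through the standard library's expat bindings and reports the line and column of the first well-formedness error. Note that this uses expat, not a Xerces-based tool, so the wording of the messages will differ. The function name `first_wellformedness_error` and the sample bytes are made up for illustration.

```python
import io
import xml.parsers.expat
from xml.parsers.expat import ExpatError


def first_wellformedness_error(stream):
    """Return None if the stream is well-formed XML,
    else a (line, column, message) tuple for the first error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # ParseFile reads the stream in chunks, so the whole
        # document never has to fit in memory at once.
        parser.ParseFile(stream)
    except ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


# Hypothetical sample: a second top-level element after the root has closed.
sample = io.BytesIO(b"<root><a>1</a></root>\n<extra/>\n")
print(first_wellformedness_error(sample))

# For a real file (the path is an assumption), open it in binary mode:
# with open("huge.xml", "rb") as f:
#     print(first_wellformedness_error(f))
```

Because the parser stops at the first error, the usual loop is to fix what it reports, rerun, and repeat until it returns `None`.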
- Know the common problems that cause an XML document not to be well-formed:
  - Missing or mismatched element closing tag.
  - Missing or mismatched attribute quote delimiter.
  - < or & in content rather than &lt; or &amp;.
  - Multiple root elements.
  - Incomplete markup after the root element.
  - Multiple XML declarations, or an XML declaration appears other than at the top of the document.
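To make the list above concrete, the following Python sketch feeds one tiny document per problem to the standard library's `xml.etree.ElementTree` parser and prints what it reports. The snippets and the names `EXAMPLES` and `check` are invented for illustration; they are not taken from the file in the question.

```python
import xml.etree.ElementTree as ET

# One tiny, made-up document per common problem, plus a well-formed control.
EXAMPLES = {
    "mismatched closing tag": "<a><b></a></b>",
    "missing attribute quote": '<a x="1></a>',
    "raw < in content": "<a>1 < 2</a>",
    "raw & in content": "<a>R&D</a>",
    "escaped content (control)": "<a>1 &lt; 2, R&amp;D</a>",
    "multiple root elements": "<a/><b/>",
    "incomplete markup after root": "<a/><b",
    "XML declaration not at top": "<a/><?xml version='1.0'?>",
}


def check(text):
    """Return 'well-formed' or the parser's error message."""
    try:
        ET.fromstring(text)
        return "well-formed"
    except ET.ParseError as err:
        # The message already includes the line and column.
        return str(err)


for label, text in EXAMPLES.items():
    print(f"{label}: {check(text)}")
```

Only the control entry should come back as well-formed. Trailing markup after the root element closes is the kind of problem that typically produces an "extra content at the end of the document" style message.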
Divide and conquer. Consider this sketch of a huge XML document:
<root>
  <First>
    <FirstChild>
      <!-- Tons of descendent markup -->
    </FirstChild>
    <SecondChild>
      <!-- Tons of descendent markup -->
    </SecondChild>
  </First>
  <Second>
    <!-- Tons of descendent markup -->
  </Second>
</root>
Process of elimination:
- Delete the First element.
- Revalidate.
- If error goes away, restore First element and remove Second element.
- Else, remove FirstChild element.
- Repeat until error can be more easily spotted in the reduced XML document.
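The process of elimination can also be automated as a bisection. The Python sketch below is one possible approach, not the answer's own method. It assumes the document has already been split into a list of text chunks that are each meant to be a complete child element of the root (for instance one chunk per `<tu>`), and that exactly one chunk is broken. The names `is_well_formed`, `wrap`, and `find_bad_chunk` are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def wrap(chunks):
    """Put the chunks inside a throwaway root so they can be parsed alone."""
    return "<root>" + "".join(chunks) + "</root>"


def find_bad_chunk(chunks):
    """Return the index of the single broken chunk, or None if all parse.

    Assumes exactly one chunk is malformed; each step keeps the half
    that still fails to parse, like deleting half and revalidating.
    """
    if is_well_formed(wrap(chunks)):
        return None
    lo, hi = 0, len(chunks)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_well_formed(wrap(chunks[lo:mid])):
            lo = mid  # first half is fine, so the error is in the second
        else:
            hi = mid  # the error is in the first half
    return lo


# Made-up chunks; the third one never closes its element.
chunks = ["<a>1</a>", "<b>2</b>", "<c>3<c>", "<d>4</d>"]
print(find_bad_chunk(chunks))
```

Each round halves the search space, so even a few hundred thousand chunks need only around twenty parses of progressively smaller pieces.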
See also
- How to parse invalid (bad / not well-formed) XML?
Source: https://stackoverflow.com/questions/47531968/how-can-i-validate-my-3-000-000-line-long-xml-file