Question
I'm putting some page content (which has been run through Tidy, but doesn't need to be if that is a source of problems) into a DOMDocument using DOMDocument::loadHTML.
It's coming up with various errors:
ID x already defined in Entity, line X
Is there any way to make either DOMDocument (or Tidy) ignore or strip out duplicate element IDs, so it will actually create the DOMDocument?
Thanks. :)
Answer 1:
A quick search on the subject reveals this (incorrect) bug report:
http://bugs.php.net/bug.php?id=46136
The last reply states the following:
You're using HTML 4 rules to load an XHTML document. Either use the load() method to parse as XML or the libxml_use_internal_errors() function to ignore the warnings.
I can't be sure whether you are encountering this problem for the same reason, since you did not include a reference to the HTML page being loaded. In any case, calling libxml_use_internal_errors(true) before loading should at least suppress the warnings.
IDs in HTML documents are required to be unique, so the best solution would still be to validate and fix your document, if at all possible.
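As a rough illustration of the libxml_use_internal_errors() suggestion, the sketch below loads markup with a duplicated ID while collecting parser errors internally instead of raising PHP warnings. The sample markup and the variable names ($html, $messages) are made up for the example; the real page content would take their place. Whether a duplicate-ID error is actually reported may depend on the libxml2 version in use.

```php
<?php
// Hypothetical sample markup with a duplicated id, standing in for the real page.
$html = '<html><body><div id="x">first</div><div id="x">second</div></body></html>';

// Collect libxml errors internally instead of emitting PHP warnings.
// The call returns the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML($html);

// Keep the collected messages in case they are worth logging, then clear the buffer.
$messages = [];
foreach (libxml_get_errors() as $error) {
    // Each $error is a LibXMLError object; its message usually ends with a newline.
    $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
}
libxml_clear_errors();

// Restore the previous error-handling mode.
libxml_use_internal_errors($previous);

// Count the <div> elements that made it into the document.
$divCount = $doc->getElementsByTagName('div')->length;
```

Restoring the previous setting keeps the change local, so other code that relies on the default warning behaviour is not affected.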
Answer 2:
By definition, IDs are unique. If yours are not, you should use classes instead (or names, where applicable).
I doubt you can force XML tools to ignore duplicate IDs, since that would make them accept an invalid XML document.
Answer 3:
Use exceptions to handle duplicate IDs, and rename the second ID. Alternatively, combine the elements as sub-elements of a single parent that carries the ID.
IDs are unique within an XML file (that is, within the root element of the XML tree).
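One possible reading of "rename the second ID" is sketched below. It does not use exceptions, since DOMDocument::loadHTML reports duplicate IDs as libxml errors rather than throwing; instead it silences those errors, loads the document, and then renames repeated IDs with XPath. The sample markup and the "-2", "-3" suffix scheme are assumptions made for the example, not anything prescribed by DOMDocument.

```php
<?php
// Hypothetical sample markup; the real page content would go here.
$html = '<html><body><p id="a">1</p><p id="a">2</p><p id="b">3</p><p id="a">4</p></body></html>';

// Load quietly: duplicate-ID complaints are collected internally and discarded.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Snapshot every element carrying an id attribute, in document order.
$xpath = new DOMXPath($doc);
$elements = iterator_to_array($xpath->query('//*[@id]'));

// Record all ids present in the source so new names never collide with them.
$taken = [];
foreach ($elements as $element) {
    $taken[$element->getAttribute('id')] = true;
}

// Keep the first occurrence of each id and rename later ones.
$seen = [];
foreach ($elements as $element) {
    $id = $element->getAttribute('id');
    if (!isset($seen[$id])) {
        $seen[$id] = true;
        continue;
    }
    // Assumed naming scheme: append the first free numeric suffix.
    $n = 2;
    while (isset($taken[$id . '-' . $n])) {
        $n++;
    }
    $newId = $id . '-' . $n;
    $element->setAttribute('id', $newId);
    $taken[$newId] = true;
}

// Gather the final ids in document order for inspection.
$finalIds = [];
foreach ($xpath->query('//*[@id]') as $element) {
    $finalIds[] = $element->getAttribute('id');
}
```

Renaming changes what CSS selectors and scripts keyed on the original ID will match, so fixing the source markup remains preferable where that is an option.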
Source: https://stackoverflow.com/questions/415927/domdocument-ignore-duplicate-element-ids