Is there a Java XML API that can parse a document without resolving character entities?

不羁岁月 提交于 2019-11-26 21:06:50

The STaX API has support for the notion of not replacing character entity references, by way of the IS_REPLACING_ENTITY_REFERENCES property:

Requires the parser to replace internal entity references with their replacement text and report them as characters

This can be set into an XmlInputFactory, which is then in turn used to construct an XmlEventReader or XmlStreamReader. However, the API is careful to say that this property is only intended to force the implementation to perform the replacement, rather than forcing it to not replace them. Still, it's got to be worth a try.

Jim Ferrans

A SAX parse with an org.xml.sax.EntityResolver might suit your purpose. You could for sure suppress them, and you could probably find a way to leave them unresolved.

This tutorial seems the most relevant: it shows how to resolve entities into strings.

I am not a Java developer, but I "think" Java xml classes support a similar functionality to .net for accomplishing this. IN .net the xmlreadersettings class you set the ProhibitDtd property false and set the XmlResolver property to null. This will cause the parser to ignore externally referenced entities without throwing an exception when they are read. I just did a google search for "Java ignore enity" and got lots of hits, some of which appear to address this topic. I realize this is not a total answer to your question but it should point you in a useful direction.

user2050348

Works for me only when disabling support of external entities:

XMLInputFactory inputFactory = XMLInputFactory.newInstance();
inputFactory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, false);
inputFactory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!