Java: How to prevent 'systemId' in EntityResolver#resolveEntity(String publicId, String systemId) from being absolutized to current working directory

Posted by 旧巷老猫 on 2019-12-05 20:37:27

Question


I want to parse the following XML document to resolve all entities in it:

 <!DOCTYPE doc SYSTEM 'mydoc.dtd'>
 <doc>&title;</doc>

My EntityResolver is supposed to fetch the external entity with the given system ID from a database and hand it back to the parser. Here is an illustration:

 // Needs: java.io.ByteArrayInputStream, java.io.IOException,
 // org.xml.sax.EntityResolver, org.xml.sax.InputSource, org.xml.sax.SAXException
 private static class MyEntityResolver implements EntityResolver
 {
     @Override
     public InputSource resolveEntity(String publicId, String systemId)
         throws SAXException, IOException
     {
         // At this point, systemId has always been absolutized against the current
         // working directory, even though the XML document specified it as relative,
         // e.g. "file:///H:/mydoc.dtd" instead of just "mydoc.dtd".
         // Why? How can I prevent this?

         // SgmlEntity and findEntityFromDatabase are my own code (not shown).
         SgmlEntity entity = findEntityFromDatabase(systemId);
         InputSource is = new InputSource(new ByteArrayInputStream(entity.getContents()));
         is.setPublicId(publicId);
         is.setSystemId(systemId);
         return is;
     }
 }

I tried both DOM (DocumentBuilder) and SAX (XMLReader), in each case calling setEntityResolver(new MyEntityResolver()), but the systemId passed to MyEntityResolver#resolveEntity(String publicId, String systemId) is always absolutized against the current working directory.

I also tried calling setFeature("http://xml.org/sax/features/resolve-dtd-uris", false), but that had no effect.

So how can I get the system ID exactly as it was written in the XML document?

Thanks!


Answer 1:


Apparently, there is another interface called EntityResolver2 (in org.xml.sax.ext), which extends the old EntityResolver. (Talk about confusing names!)

Anyway, I found that EntityResolver2 achieves what I wanted. Its resolveEntity(String name, String publicId, String baseURI, String systemId) method receives the base URI as a separate argument and leaves the systemId untouched, so the systemId is always exactly what was specified in the XML document.
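
For illustration, here is a minimal sketch of what that can look like with the SAX parser bundled in the JDK. The names RelativeSystemIdDemo and RecordingResolver are made up for this example, and the in-memory DTD string is only a stand-in for the database lookup from the question. It assumes the parser supports EntityResolver2 (the http://xml.org/sax/features/use-entity-resolver2 feature, which defaults to true).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.EntityResolver2;
import org.xml.sax.helpers.DefaultHandler;

class RelativeSystemIdDemo {

    /** Records each system ID it is asked for and serves a stand-in DTD from memory. */
    static class RecordingResolver implements EntityResolver2 {
        final List<String> seenSystemIds = new ArrayList<>();

        @Override
        public InputSource getExternalSubset(String name, String baseURI) {
            // Do not invent a DTD for documents that have no DOCTYPE.
            return null;
        }

        @Override
        public InputSource resolveEntity(String name, String publicId,
                                         String baseURI, String systemId) {
            // systemId is passed as written in the document; baseURI comes separately.
            seenSystemIds.add(systemId);
            // Hypothetical replacement for findEntityFromDatabase(systemId).
            InputSource is = new InputSource(new StringReader("<!ENTITY title 'Hello'>"));
            is.setPublicId(publicId);
            is.setSystemId(systemId);
            return is;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // Legacy entry point, only used by parsers that ignore EntityResolver2.
            return resolveEntity(null, publicId, null, systemId);
        }
    }

    /** Parses the XML text and returns the system IDs the resolver was asked for. */
    static List<String> parseAndCollect(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        RecordingResolver resolver = new RecordingResolver();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(xml)));
        return resolver.seenSystemIds;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc SYSTEM 'mydoc.dtd'>\n<doc>&title;</doc>";
        System.out.println(parseAndCollect(xml));
    }
}
```

With the JDK's built-in parser this should print [mydoc.dtd]. If you still need an absolute URI somewhere, you can resolve systemId against baseURI yourself.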




Answer 2:


From the EntityResolver Javadocs:

If the system identifier is a URL, the SAX parser must resolve it fully before reporting it to the application.

Also, the org.xml.sax docs have this to say about the resolve-dtd-uris feature:

It does not apply to EntityResolver.resolveEntity(), which is not used to report declarations...

I think you either have to set the base URI to something you can live with, or use public IDs instead of system IDs.
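
If you have to stay with the plain EntityResolver, the first option could look roughly like the sketch below: give the InputSource a known base URI and strip it off again inside the resolver. The base file:///db/ is an arbitrary placeholder, and BaseUriDemo, StrippingResolver and the in-memory DTD are names invented for this example. It assumes the parser resolves the relative system ID against the system ID set on the InputSource, which is what the JDK's built-in parser appears to do.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class BaseUriDemo {

    // Arbitrary placeholder base (an assumption); nothing is read from this location.
    static final URI BASE = URI.create("file:///db/");

    /** Turns the absolutized system ID back into a path relative to BASE. */
    static class StrippingResolver implements EntityResolver {
        final List<String> relativeIds = new ArrayList<>();

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // systemId arrives resolved against BASE; relativize it again.
            String relative = BASE.relativize(URI.create(systemId)).toString();
            relativeIds.add(relative);
            // Hypothetical replacement for a lookup keyed by the relative ID.
            InputSource is = new InputSource(new StringReader("<!ENTITY title 'Hello'>"));
            is.setPublicId(publicId);
            is.setSystemId(systemId);
            return is;
        }
    }

    /** Parses the XML text with BASE as its base URI and returns the relative IDs seen. */
    static List<String> parseAndCollect(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        StrippingResolver resolver = new StrippingResolver();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(new DefaultHandler());
        InputSource source = new InputSource(new StringReader(xml));
        source.setSystemId(BASE.toString()); // base for relative system IDs
        reader.parse(source);
        return resolver.relativeIds;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc SYSTEM 'mydoc.dtd'>\n<doc>&title;</doc>";
        System.out.println(parseAndCollect(xml));
    }
}
```

The resolver still receives an absolute URI, as the Javadoc requires, but because the base is one you chose, the original relative part can be recovered reliably instead of depending on the working directory.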



Source: https://stackoverflow.com/questions/1648291/java-how-to-prevent-systemid-in-entityresolverresolveentitystring-publicid
