Java, XML, XSLT: Prevent DTD validation

醉酒成梦 2020-12-16 06:28

I use the Java (6) XML API to apply an XSLT transformation to an HTML document from the web. This document is well-formed XHTML and so contains a valid DTD spec (

5 answers
  • 2020-12-16 06:37

    You need to be using javax.xml.parsers.DocumentBuilderFactory:

    // Assumes xsltSource (a javax.xml.transform.Source for the stylesheet) is defined elsewhere.
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(false);
    DocumentBuilder builder = factory.newDocumentBuilder();
    InputSource src = new InputSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award");
    // Parse the InputSource itself: getByteStream() is null when only a system ID was given.
    Document xmlDocument = builder.parse(src);
    DOMSource source = new DOMSource(xmlDocument);
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer = tf.newTransformer(xsltSource);
    transformer.transform(source, new StreamResult(System.out));
    
  • 2020-12-16 06:41

    I recently had this issue while unmarshalling XML using JAXB. The answer was to create a SAXSource from an XMLReader and an InputSource, then pass that to the JAXB Unmarshaller's unmarshal() method. To avoid loading the external DTD, I set a custom EntityResolver on the XMLReader.

    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = spf.newSAXParser();
    XMLReader xmlr = sp.getXMLReader();
    xmlr.setEntityResolver(new EntityResolver() {
        public InputSource resolveEntity(String pid, String sid) throws SAXException {
            if (sid.equals("your remote dtd url here"))
                return new InputSource(new StringReader("actual contents of remote dtd"));
            throw new SAXException("unable to resolve remote entity, sid = " + sid);
        } } );
    SAXSource ss = new SAXSource(xmlr, myInputSource);
    

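    As written, this custom entity resolver will throw an exception if it's ever asked to resolve an entity OTHER than the one you want it to resolve. If you just want the parser to go ahead and load such an entity from its remote location, replace the "throw" statement with "return null;" (simply deleting it would leave the method without a return value).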
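
    The resolver described above can be tried out without JAXB or network access. The following is only a minimal sketch: the class name, the system ID under example.invalid, and the one-line DTD are made up for illustration. It serves a local copy for one known system ID and returns null for anything else, which asks the parser to fall back to its default lookup.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocalDtdResolverDemo {
    // Hypothetical system ID and DTD contents, used only for this illustration.
    static final String DTD_URL = "http://example.invalid/demo.dtd";
    static final String DTD_TEXT = "<!ENTITY greeting \"hello\">";

    /** Serves a local copy of one known DTD; returns null for everything else. */
    static EntityResolver localCopyResolver() {
        return new EntityResolver() {
            public InputSource resolveEntity(String pid, String sid) {
                if (DTD_URL.equals(sid)) {
                    InputSource local = new InputSource(new StringReader(DTD_TEXT));
                    local.setSystemId(sid);
                    return local;
                }
                return null; // null asks the parser to use its default lookup
            }
        };
    }

    /** Parses the XML with the resolver installed and returns its character data. */
    static String parseText(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(localCopyResolver());
        final StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // The entity reference is expanded from the locally supplied DTD.
        String xml = "<!DOCTYPE root SYSTEM \"" + DTD_URL + "\"><root>&greeting;</root>";
        System.out.println(parseText(xml));
    }
}
```

    Because the DTD text comes from the resolver, the entity declared in it is still available to the document even though the remote URL is never contacted.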

  • 2020-12-16 06:49

    The previous answers led me to a solution, but it wasn't obvious to me, so here is a complete one:

    private void convert(InputStream xsltInputStream, InputStream srcInputStream, OutputStream destOutputStream) throws SAXException, ParserConfigurationException,
            TransformerFactoryConfigurationError, TransformerException, IOException {
        //create a parser with a fake entity resolver to disable DTD download and validation
        XMLReader xmlReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xmlReader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String pid, String sid) throws SAXException {
                return new InputSource(new ByteArrayInputStream(new byte[] {}));
            }
        });
        //create the transformer
        Source xsltSource = new StreamSource(xsltInputStream);
        Transformer transformer = TransformerFactory.newInstance().newTransformer(xsltSource);
        //create the source for the XML document which uses the reader with fake entity resolver
        Source xmlSource = new SAXSource(xmlReader, new InputSource(srcInputStream));
        transformer.transform(xmlSource, new StreamResult(destOutputStream));
    }
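
    To check the approach above offline, it can be run against in-memory data. This is a hedged sketch rather than part of the original answer: the class name, the sample document, and the stylesheet are invented, and the DOCTYPE deliberately points at an unreachable host. It also switches the SAX factory to namespace-aware mode, which is an addition on my part because XSLT processors generally expect namespace-aware SAX events.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class OfflineTransformDemo {
    // Made-up sample input: the DTD URL cannot be resolved on purpose.
    static final String XML =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
        + "\"http://example.invalid/xhtml1-strict.dtd\">"
        + "<html><head><title>Hello</title></head><body/></html>";

    // Made-up stylesheet: writes the page title as plain text.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" "
        + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:value-of select=\"/html/head/title\"/>"
        + "</xsl:template></xsl:stylesheet>";

    static String transform(String xslt, String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // assumption: safer default for XSLT input
        XMLReader reader = spf.newSAXParser().getXMLReader();
        // Answer every entity request with empty content, so no DTD is downloaded.
        reader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String pid, String sid) {
                return new InputSource(new StringReader(""));
            }
        });
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new SAXSource(reader, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XSLT, XML));
    }
}
```

    If the resolver were removed, the parser would try to fetch the DTD from the URL in the DOCTYPE, which is the behaviour the question is trying to avoid.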
    
  • 2020-12-16 06:52

    If you use

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    

    you can try disabling DTD validation with the following code:

     dbf.setValidating(false);
    
  • 2020-12-16 06:53

    Try setting a feature in your DocumentBuilderFactory:

    URL url = new URL(urlString);
    InputStream is = url.openStream();
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    DocumentBuilder db;
    db = dbf.newDocumentBuilder();
    Document result = db.parse(is);
    

    Right now I'm experiencing the same problem inside XSLT 2.0 when calling the document() function to analyse external XHTML pages.
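
    The feature-based approach from this answer can be verified without a network connection. The sketch below is an illustration under assumptions: the class name and sample document are made up, and the feature URI is specific to Xerces-based parsers (such as the one bundled with the JDK), so other parsers may reject it with a ParserConfigurationException.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NoExternalDtdDemo {
    // Xerces-specific feature; not guaranteed to exist in every JAXP parser.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Made-up sample input whose DTD URL cannot be resolved.
    static final String XML =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
        + "\"http://example.invalid/xhtml1-strict.dtd\">"
        + "<html><head><title>t</title></head><body/></html>";

    static DocumentBuilder newBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(false);
        // Fails with ParserConfigurationException if the parser lacks this feature.
        dbf.setFeature(LOAD_EXTERNAL_DTD, false);
        return dbf.newDocumentBuilder();
    }

    static String rootName(String xml) throws Exception {
        Document doc = newBuilder().parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Parses without contacting example.invalid and prints the root element name.
        System.out.println(rootName(XML));
    }
}
```

    Catching the ParserConfigurationException around setFeature and falling back to an EntityResolver would be one way to keep the code portable across parser implementations.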
