Why does org.apache.xerces.parsers.SAXParser not skip the BOM in UTF-8 encoded XML?

南方客 2020-12-07 01:54

I have an XML file with UTF-8 encoding, and it contains a BOM at the beginning of the file. During parsing I get org.xml.sax.SAXParseException: Content is not allowe

3 answers
  •  挽巷 (original poster)
    2020-12-07 02:34

    I've experienced the same problem and I've solved it with this code:

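    import java.io.*;

    private static InputStream checkForUtf8BOM(InputStream inputStream) throws IOException {
        PushbackInputStream pushbackInputStream = new PushbackInputStream(new BufferedInputStream(inputStream), 3);
        byte[] bom = new byte[3];
        int n = 0;
        // read() may return fewer than 3 bytes, so loop until the buffer is full or the stream ends.
        while (n < bom.length) {
            int r = pushbackInputStream.read(bom, n, bom.length - n);
            if (r == -1) break;
            n += r;
        }
        boolean hasBom = n == 3 && bom[0] == (byte) 0xEF && bom[1] == (byte) 0xBB && bom[2] == (byte) 0xBF;
        if (!hasBom && n > 0) {
            // Not a BOM: push back only the bytes that were actually read.
            pushbackInputStream.unread(bom, 0, n);
        }
        return pushbackInputStream;
    }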
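
For illustration, here is a minimal sketch of how a BOM-stripping wrapper like the one above might be plugged in before SAX parsing. It uses the JAXP default parser from `SAXParserFactory` rather than `org.apache.xerces.parsers.SAXParser` directly, so treat that as an assumption. The class `BomSkipDemo` and the helpers `skipUtf8Bom`, `rootElementName` and `sampleXml` are hypothetical names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class BomSkipDemo {

    // Hypothetical helper: drops a leading UTF-8 BOM (EF BB BF) if present.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // Keep reading until 3 bytes are buffered or the stream ends.
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r == -1) break;
            n += r;
        }
        boolean hasBom = n == 3
                && head[0] == (byte) 0xEF && head[1] == (byte) 0xBB && head[2] == (byte) 0xBF;
        if (!hasBom && n > 0) {
            // No BOM: return the bytes we consumed to the stream.
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    // Parses the stream with the JAXP default SAX parser and returns the root element name.
    static String rootElementName(InputStream xml) throws Exception {
        final String[] root = new String[1];
        SAXParserFactory.newInstance().newSAXParser().parse(xml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (root[0] == null) root[0] = qName;
            }
        });
        return root[0];
    }

    // Builds a sample document: an optional BOM followed by UTF-8 encoded XML.
    static byte[] sampleXml(boolean withBom) {
        byte[] body = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>"
                .getBytes(StandardCharsets.UTF_8);
        if (!withBom) return body;
        byte[] out = new byte[body.length + 3];
        out[0] = (byte) 0xEF; out[1] = (byte) 0xBB; out[2] = (byte) 0xBF;
        System.arraycopy(body, 0, out, 3, body.length);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Strip the BOM first, then hand the stream to the parser.
        InputStream in = skipUtf8Bom(new ByteArrayInputStream(sampleXml(true)));
        System.out.println(rootElementName(in));
    }
}
```

The wrapper only touches the first three bytes, so a stream without a BOM reaches the parser unchanged. If the input is opened as a `Reader` instead of an `InputStream`, the BOM would show up as the character U+FEFF and would need to be handled at the character level instead.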
    
