Decode large Base64 from XML in Java: OutOfMemoryError

Submitted by 孤人 on 2019-12-03 08:46:43

Try the StAX API (tutorial). For a large text element, the parser will usually deliver several text events rather than one, and you push each of them into a streaming Base64 implementation (like the one skaffman mentioned).
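As a rough sketch of the StAX side only, the helper below copies an element's text to a Writer one text event at a time, so the full text is never held in a single String. The class name StaxTextStreamer, the method name and the elementName parameter are made up for illustration, and it assumes the element contains no nested element of the same name. Whether the parser really splits the text into several events depends on the implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxTextStreamer {

    /**
     * Copies the text of the first element named elementName to sink,
     * one text event at a time. Returns the number of text events seen.
     */
    static int copyElementText(Reader xml, String elementName, Writer sink)
            throws XMLStreamException, IOException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Non-coalescing mode allows the parser to split long text into several events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        int events = 0;
        boolean inside = false;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    inside = true;
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    break; // done with the element we care about
                } else if (inside && (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA)) {
                    // Write the parser's own buffer slice; no String is created.
                    sink.write(reader.getTextCharacters(),
                            reader.getTextStart(), reader.getTextLength());
                    events++;
                }
            }
        } finally {
            reader.close();
        }
        return events;
    }
}
```

The sink can be any Writer, for example one layered over a decoding output stream that writes to a file.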

Apache Commons Codec has a Base64OutputStream, which should allow you to feed the XML data in a scalable way, by chaining the Base64OutputStream with a FileOutputStream.

You'll need a representation of the XML as a String, so you may not even have to read it into a DOM structure at all.

Something like:

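import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.PrintWriter;
import org.apache.commons.codec.binary.Base64OutputStream;

// Base64OutputStream encodes by default; pass false so it decodes what is written to it.
PrintWriter printWriter = new PrintWriter(
   new Base64OutputStream(
      new BufferedOutputStream(
         new FileOutputStream("/path/to/my/file")
      ),
      false
   )
);
printWriter.write(myXml);
printWriter.close();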

If the input XML file is too big, then you should read chunks of it into a buffer in a loop, writing the buffer contents to the output (i.e. a standard reader-to-writer copy).
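A minimal sketch of such a chunked copy loop follows. To stay self-contained it uses the JDK's java.util.Base64 (Java 8+) instead of Commons Codec, and it works on byte streams rather than a Reader and Writer. The class name Base64ChunkCopy and the method name decodeTo are invented for this example, and the input is assumed to contain only the Base64 text, not the surrounding XML.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;

class Base64ChunkCopy {

    /** Decodes Base64 text read from base64Text and writes the raw bytes to out. */
    static long decodeTo(InputStream base64Text, OutputStream out) throws IOException {
        // The MIME decoder skips line breaks, which large Base64 payloads often contain.
        InputStream decoded = Base64.getMimeDecoder().wrap(base64Text);
        byte[] buffer = new byte[8192]; // only this much decoded data is in memory at once
        long total = 0;
        int n;
        while ((n = decoded.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total; // number of decoded bytes written
    }
}
```

Memory use stays bounded by the buffer size regardless of how large the payload is.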

I don't think any XML API would let you access an element's text as a stream rather than as a String. If the String is 100 MB, then your only option is probably to raise the JVM's maximum heap size until you no longer get an OutOfMemoryError:

java -Xmx256m your.class.Name

If your file can get that big, never use a DOM parser. Use a simple SAX approach to access the data elements, and stream the base64 data into Base64OutputStream as mentioned above.
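The SAX approach could look roughly like the sketch below. It does not use Base64OutputStream; it buffers the characters in groups of four and decodes them with the JDK's java.util.Base64 so that the example has no external dependency. The class name Base64ElementHandler and the elementName parameter are hypothetical, and the sketch assumes padding appears only at the very end of the element text.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Base64;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class Base64ElementHandler extends DefaultHandler {
    private final String elementName;              // hypothetical element holding the Base64 text
    private final OutputStream out;                // destination for the decoded bytes
    private final byte[] pending = new byte[4096]; // size is a multiple of 4 (one Base64 group)
    private int count;
    private boolean inside;

    Base64ElementHandler(String elementName, OutputStream out) {
        this.elementName = elementName;
        this.out = out;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals(elementName)) {
            inside = true;
            count = 0;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (!inside) {
            return;
        }
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (Character.isWhitespace(c)) {
                continue; // skip line breaks between Base64 lines
            }
            if (c > 127) {
                throw new SAXException("Non-ASCII character in Base64 text");
            }
            pending[count++] = (byte) c; // the Base64 alphabet is plain ASCII
            if (count == pending.length) {
                flush(); // buffer holds only complete 4-character groups
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (inside && qName.equals(elementName)) {
            flush(); // the last group may carry '=' padding
            inside = false;
        }
    }

    private void flush() throws SAXException {
        try {
            out.write(Base64.getDecoder().decode(Arrays.copyOf(pending, count)));
            count = 0;
        } catch (IOException | IllegalArgumentException e) {
            throw new SAXException("Could not decode or write Base64 data", e);
        }
    }
}
```

To use it, pass an instance together with the input to SAXParser.parse, giving it a BufferedOutputStream over a FileOutputStream as the destination.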

Jörn Horstmann

As lbruder said, use a SAX parser to read the document in a streaming fashion. If you use Base64OutputStream, you have to pass the constructor flag that makes it decode instead of encode (the default). You also have to convert the char array from the characters callback to a byte array before passing it to the output stream, which requires additional memory allocations and copies.

I wrote an alternative Base64 decoder for exactly this use case; it is available on GitHub. Here is an example of how to use it:

Base64StreamDecoder decoder = new Base64StreamDecoder();
OutputStream out;

...

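// Exception handling is omitted for brevity: SAX callbacks may only throw
// SAXException, so any IOException has to be caught and wrapped.

public void startElement(String uri, String localName, String qName, Attributes atts) {
    decoder.reset();
    out = new BufferedOutputStream(new FileOutputStream(...));
}

public void endElement(String uri, String localName, String qName) {
    decoder.checkComplete();
    out.close();
}

// SAX may call this several times per element, each time with one chunk of the text.
public void characters(char[] ch, int start, int length) {
    decoder.decode(ch, start, length, out);
}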