How to read XLSX file of size >40MB

不知归路 2020-12-11 16:13

I am using XSSF of apache-POI to read the XLSX file. I was getting an error java.lang.OutOfMemoryError: Java heap space. Later, increa

1 answer
  • 2020-12-11 16:55

    POI allows you to read Excel files in a streaming manner. The API is pretty much a wrapper around SAX. Make sure you open the OPC package the right way, using the open method that takes a String path (or a File). The overload that takes an InputStream has to hold the whole file in memory, so you could run out of memory immediately.

    // Open by path so POI does not have to hold the whole zip file in memory.
    OPCPackage pkg = OPCPackage.open(file.getPath());
    XSSFReader reader = new XSSFReader(pkg);
    

    Now, reader will allow you to get InputStreams for the different parts. If you want to do the XML parsing yourself (using SAX or StAX), you can use these. But it requires being very familiar with the format.

    An easier option is to use XSSFSheetXMLHandler. Here is an example that reads the first sheet:

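    StylesTable styles = reader.getStylesTable();
    ReadOnlySharedStringsTable sharedStrings = new ReadOnlySharedStringsTable(pkg);
    // Last argument is formulasNotResults: true reports formula text instead of cached results.
    ContentHandler handler = new XSSFSheetXMLHandler(styles, sharedStrings, mySheetContentsHandler, true);
    
    // Note: XMLReaderFactory is deprecated since Java 9; SAXParserFactory is the JDK replacement.
    XMLReader parser = XMLReaderFactory.createXMLReader();
    parser.setContentHandler(handler);
    // getSheetsData() iterates over the sheets; next() returns the first sheet's stream.
    try (InputStream sheet = reader.getSheetsData().next()) {
        parser.parse(new InputSource(sheet));
    }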
    

    Here mySheetContentsHandler is your own implementation of XSSFSheetXMLHandler.SheetContentsHandler. XSSFSheetXMLHandler calls it for every row and cell as the sheet is parsed.
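
    As a rough illustration of such a handler, here is a minimal sketch. The class name RowForwardingHandler and its rowCount() method are made up for this example, and the method signatures assume POI 3.11 or later (older versions declare endRow() and cell(...) differently). It buffers only the current row and hands it to a callback, so memory use stays flat regardless of sheet size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler;
import org.apache.poi.xssf.usermodel.XSSFComment;

/** Hypothetical handler: forwards each completed row to a callback, then forgets it. */
final class RowForwardingHandler implements SheetContentsHandler {
    private final Consumer<List<String>> rowConsumer;
    private List<String> currentRow;
    private int rowCount;

    RowForwardingHandler(Consumer<List<String>> rowConsumer) {
        this.rowConsumer = rowConsumer;
    }

    @Override
    public void startRow(int rowNum) {
        currentRow = new ArrayList<>();
    }

    @Override
    public void cell(String cellReference, String formattedValue, XSSFComment comment) {
        // Only cells present in the file are reported, so the list index is not
        // a column index; use cellReference (e.g. "C7") if positions matter.
        currentRow.add(formattedValue);
    }

    @Override
    public void endRow(int rowNum) {
        rowConsumer.accept(currentRow);
        rowCount++;
        currentRow = null; // drop the row so it can be garbage collected
    }

    @Override
    public void headerFooter(String text, boolean isHeader, String tagName) {
        // Headers and footers are ignored in this sketch.
    }

    int rowCount() {
        return rowCount;
    }
}
```

    You would then pass something like new RowForwardingHandler(row -> System.out.println(row)) as mySheetContentsHandler in the snippet above.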

    Note, however, that this can still use a fair amount of memory if your shared strings table is huge, because ReadOnlySharedStringsTable loads every string up front. That happens when your huge sheets contain few or no duplicate strings. If memory is still a problem, I recommend using the raw XML streams (also provided by XSSFReader).
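
    To give an idea of what working with the raw streams involves, here is a sketch that uses only the JDK's StAX parser. RawSheetReader and CellCallback are hypothetical names, not part of POI. It assumes the usual worksheet layout (row elements containing c cells, each with a v value element) and deliberately ignores inline strings and cells without a value. For cells with t="s" the raw value is only an index into the shared strings part, which you would still have to resolve yourself.

```java
import java.io.InputStream;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: streams raw worksheet XML with StAX, one cell at a time. */
final class RawSheetReader {

    /** Receives the cell reference (e.g. "B7"), the t attribute (may be null) and the raw v text. */
    interface CellCallback {
        void cell(String ref, String type, String rawValue);
    }

    /** Reads the sheet XML, reports each cell that has a v element, and returns the row count. */
    static int readCells(InputStream sheetXml, CellCallback callback) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Worksheet XML needs neither DTDs nor external entities, so switch them off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        int rows = 0;
        String ref = null;
        String type = null;
        StringBuilder value = null; // non-null only while inside a v element
        try {
            XMLStreamReader xml = factory.createXMLStreamReader(sheetXml);
            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if ("row".equals(name)) {
                        rows++;
                    } else if ("c".equals(name)) {
                        ref = xml.getAttributeValue(null, "r");
                        type = xml.getAttributeValue(null, "t");
                    } else if ("v".equals(name)) {
                        value = new StringBuilder();
                    }
                } else if (value != null
                        && (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA)) {
                    // Text may arrive in several chunks, so append rather than assign.
                    value.append(xml.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && value != null && "v".equals(xml.getLocalName())) {
                    callback.cell(ref, type, value.toString());
                    value = null;
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse sheet XML", e);
        }
        return rows;
    }
}
```

    With POI, the InputStream would be the one returned by reader.getSheetsData().next(). Only one cell is held in memory at a time, which is the point of going this low-level.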
