How to split an Excel file using Java?

Anonymous (unverified), submitted 2019-12-03 10:10:24

Question:

I have an Excel sheet with 200,000 rows and want to split it into separate files of 50,000 records each. I am using the Apache POI API to read and write Excel files. Is it possible to start a new file whenever the number of rows reaches a defined record size? Please help me find a solution to this problem.

Code:

public String[][] getSheetData(int SheetIndex) {
    int noOfColumns = 0;
    XSSFRow row = null;
    XSSFCell cell = null;
    int i = 0;
    int noOfRows = 0;
    int j = 0;
    String[][] data = null;
    XSSFSheet sheet = null;

    try {
        loadFile();  //load give Excel
        if (validateIndex(SheetIndex)) {
            sheet = workbook.getSheetAt(SheetIndex);
            noOfColumns = getNumberOfColumns(SheetIndex);
            noOfRows = getNumberOfRows(SheetIndex) + 1;
            data = new String[noOfRows][noOfColumns];
            Iterator rowIter = sheet.rowIterator();
            while (rowIter.hasNext()) {
                row = (XSSFRow) rowIter.next();
                Iterator cellIter = row.cellIterator();
                j = 0;
                while (cellIter.hasNext()) {
                    cell = (XSSFCell) cellIter.next();
                    if (cell.getCellType() == cell.CELL_TYPE_STRING) {
                        data[i][j] = cell.getStringCellValue();
                    } else if (cell.getCellType() == cell.CELL_TYPE_NUMERIC) {
                        data[i][j] = Double.toString(cell.getNumericCellValue());
                    }
                    j++;
                }
                i++;
            }   // outer while
        } else {
            throw new InvalidSheetIndexException("Invalid sheet index.");
        }
    } catch (Exception ex) {
        logger.error(ex);
    }
    return data;
}

Exception thrown:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.xmlbeans.impl.store.Locale$ScrubBuffer.<init>(Locale.java:1885)
    at org.apache.xmlbeans.impl.store.Locale.getScrubBuffer(Locale.java:1904)
    at org.apache.xmlbeans.impl.store.Xobj.getValueAsString(Xobj.java:1205)
    at org.apache.xmlbeans.impl.store.Xobj.fetch_text(Xobj.java:1796)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.get_wscanon_text(XmlObjectBase.java:1332)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1269)
    at org.apache.xmlbeans.impl.values.JavaLongHolder.longValue(JavaLongHolder.java:53)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.getLongValue(XmlObjectBase.java:1502)
    at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTRowImpl.getR(Unknown Source)
    at org.apache.poi.xssf.usermodel.XSSFRow.getRowNum(XSSFRow.java:321)
    at org.apache.poi.xssf.usermodel.XSSFSheet.initRows(XSSFSheet.java:180)
    at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:147)
    at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:134)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:235)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:190)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:189)
    at com.arosys.excelReading.ReadXLSX.loadFile(ReadXLSX.java:49)
    at com.arosys.excelReading.ReadXLSX.getNumberOfSheet(ReadXLSX.java:121)
    at com.arosys.excelReading.ReadXLSX.main(ReadXLSX.java:324)
Java Result: 1

Thanks

Answer 1:

If you cannot read this document with POI but the table is simple, I suggest opening it in Excel, saving it as a CSV file, and then splitting that file into fragments. This can even be done with a shell script or batch file.

This recommendation applies if you have one huge file that you need to split once. If, however, you are implementing a service that receives such Excel files and has to split them, look for another solution.
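For the CSV route, the splitting step could also be done in plain Java. This is only a minimal sketch under some assumptions that are not in the question: the first line is a header that should be repeated in every fragment, no quoted field contains a line break, and the file is UTF-8. The class name `CsvSplitter` and the `part-N.csv` output names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CsvSplitter {

    /**
     * Splits a CSV file into fragments of at most rowsPerFile data rows.
     * The header line is repeated at the top of every fragment.
     * Returns the number of fragments written.
     */
    static int split(Path input, Path outDir, int rowsPerFile) throws IOException {
        Files.createDirectories(outDir);
        int fileCount = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String header = reader.readLine();
            if (header == null) {
                return 0; // empty input, nothing to split
            }
            BufferedWriter writer = null;
            int rowsInFile = 0;
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (writer == null) {
                        // Open the next fragment lazily so no empty file is left behind
                        fileCount++;
                        Path out = outDir.resolve("part-" + fileCount + ".csv");
                        writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8);
                        writer.write(header);
                        writer.newLine();
                        rowsInFile = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    if (++rowsInFile == rowsPerFile) {
                        writer.close();
                        writer = null;
                    }
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return fileCount;
    }

    public static void main(String[] args) throws IOException {
        // Example: java CsvSplitter big.csv out 50000
        int n = split(Path.of(args[0]), Path.of(args[1]), Integer.parseInt(args[2]));
        System.out.println("Wrote " + n + " fragment(s)");
    }
}
```

Only one line is held in memory at a time, so the size of the input should not matter for heap usage.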



Answer 2:

The default Java heap size is very small, especially when working with large XML files.

You just need to increase the memory available to the JVM and you should be fine. Set a reasonable maximum heap size with the -Xmx option when running Java.
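To check that the -Xmx setting actually reaches your program, you can print the maximum heap the JVM reports. A small sketch follows; the class name `HeapCheck` and the 2g value in the comment are only examples, not values from the question.

```java
class HeapCheck {

    /** Maximum heap the JVM will try to use, in MiB, as reported by the runtime. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run for example as: java -Xmx2g HeapCheck
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

If the printed value does not change when you change -Xmx, the option is probably being passed to the wrong place (for example after the class name, where it becomes a program argument).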



Answer 3:

  1. First of all, follow Gagravarr's recommendation and raise the maximum heap size. If your platform allows it, switch to a 64-bit JVM. This brute-force approach alone may solve your problem.

  2. If possible, avoid .xlsx files: POI consumes much more memory on .xlsx than on .xls. Keep in mind that an .xls sheet is limited to 65,536 rows.

  3. The POI documentation recommends the event model for reading, which is where most of the memory is consumed. For writing, you may also try their new SXSSF API, which has a small memory footprint (still in beta).
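Point 3 can be sketched roughly as follows: read the sheet with the SAX-based event API and write each chunk with SXSSF, so neither side holds the whole file in memory. This is an untested sketch that assumes Apache POI 5.x on the classpath (class locations and handler signatures differ in older releases). `XlsxSplitter`, `ChunkWriter`, the `part-N.xlsx` naming and the window size of 100 are made up for illustration. It copies only the first sheet, writes every value as its formatted string (types and styles are lost), and packs rows contiguously.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Iterator;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.util.XMLHelper;
import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFComment;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class XlsxSplitter {

    /** Receives rows from the SAX parser and writes them to size-limited workbooks. */
    static class ChunkWriter implements SheetContentsHandler {
        private final String outPrefix;
        private final int rowsPerFile;
        private SXSSFWorkbook workbook;
        private Sheet sheet;
        private Row currentRow;
        private int rowsInChunk;
        private int filesWritten;

        ChunkWriter(String outPrefix, int rowsPerFile) {
            this.outPrefix = outPrefix;
            this.rowsPerFile = rowsPerFile;
        }

        @Override
        public void startRow(int rowNum) {
            if (workbook == null) {
                workbook = new SXSSFWorkbook(100); // keep about 100 rows in memory
                sheet = workbook.createSheet();
            }
            currentRow = sheet.createRow(rowsInChunk);
        }

        @Override
        public void cell(String cellReference, String formattedValue, XSSFComment comment) {
            int col = new CellReference(cellReference).getCol();
            currentRow.createCell(col).setCellValue(formattedValue);
        }

        @Override
        public void endRow(int rowNum) {
            if (++rowsInChunk == rowsPerFile) {
                flush();
            }
        }

        /** Writes the current chunk, if any, to the next output file. */
        void flush() {
            if (workbook == null) {
                return;
            }
            try (OutputStream out = new FileOutputStream(outPrefix + (++filesWritten) + ".xlsx")) {
                workbook.write(out);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                workbook.dispose(); // delete SXSSF temporary files
                workbook = null;
                rowsInChunk = 0;
            }
        }
    }

    /** Splits the first sheet of inputPath into files of rowsPerFile rows; returns the file count. */
    static int split(String inputPath, String outPrefix, int rowsPerFile) throws Exception {
        try (OPCPackage pkg = OPCPackage.open(inputPath, PackageAccess.READ)) {
            XSSFReader reader = new XSSFReader(pkg);
            ChunkWriter writer = new ChunkWriter(outPrefix, rowsPerFile);
            XMLReader parser = XMLHelper.newXMLReader();
            parser.setContentHandler(new XSSFSheetXMLHandler(
                    reader.getStylesTable(), new ReadOnlySharedStringsTable(pkg), writer, false));
            Iterator<InputStream> sheets = reader.getSheetsData();
            if (sheets.hasNext()) {
                try (InputStream in = sheets.next()) {
                    parser.parse(new InputSource(in));
                }
            }
            writer.flush(); // write the remaining rows of the last, partial chunk
            return writer.filesWritten;
        }
    }
}
```

For the file in the question, a call such as `XlsxSplitter.split("input.xlsx", "part-", 50000)` would be expected to produce four files. The shared strings table is still loaded in full, so very text-heavy workbooks may still need a larger heap.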


