I am looking to write to an Excel (.xls, MS Excel 2003 format) file programmatically using Java. The Excel output files may contain ~200,000 rows, which I plan to split over n files.
I had to split my output into several Excel files in order to overcome the heap space exception. I found that around 5,000 rows with 22 columns was about the limit, so I wrote my logic so that after every 5,000 rows I would close the file, start a new one, and number the files accordingly.
In cases where I had 20,000+ rows to write, the data ended up spread across 4 or more files.
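A minimal sketch of that splitting approach, assuming Apache POI's HSSF (.xls) classes; the 5,000-row chunk size and the in-memory `data` list are placeholders for however you actually load your rows:

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

public class SplitXlsWriter {

    private static final int ROWS_PER_FILE = 5_000; // tune to your heap size

    // 'data' stands in for however you load your rows; each inner list is one row's cell values
    public static void write(List<List<String>> data, String baseName) throws IOException {
        int fileIndex = 0;
        for (int start = 0; start < data.size(); start += ROWS_PER_FILE) {
            int end = Math.min(start + ROWS_PER_FILE, data.size());
            try (Workbook wb = new HSSFWorkbook();
                 FileOutputStream out = new FileOutputStream(baseName + "_" + (++fileIndex) + ".xls")) {
                Sheet sheet = wb.createSheet("data");
                for (int r = start; r < end; r++) {
                    Row row = sheet.createRow(r - start);
                    List<String> values = data.get(r);
                    for (int c = 0; c < values.size(); c++) {
                        row.createCell(c).setCellValue(values.get(c));
                    }
                }
                // writing and closing here frees the 5k-row workbook before the next chunk starts
                wb.write(out);
            }
        }
    }
}
```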
Does this memory issue happen when you insert data into cells, or when you perform the data computation/generation?
If you are loading data into an Excel file that follows a predefined, static template format, it is better to save a template and reuse it multiple times. Template cases typically come up when you generate something like a daily sales report.
Otherwise, you have to create every row, border, column, etc. from scratch each time.
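As a rough illustration of the template idea, here is a hedged sketch that opens a pre-built .xls template with POI and fills in only the data rows; the file names and cell layout are assumptions:

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class TemplateReport {
    public static void main(String[] args) throws IOException {
        // Open the static .xls template (headers, borders, styles already defined in Excel)
        try (FileInputStream in = new FileInputStream("sales_template.xls");
             HSSFWorkbook wb = new HSSFWorkbook(in)) {

            Sheet sheet = wb.getSheetAt(0);
            // Fill only the data area; row 0 keeps the template's header formatting
            for (int r = 1; r <= 10; r++) {
                Row row = sheet.createRow(r);
                row.createCell(0).setCellValue("item-" + r);
                row.createCell(1).setCellValue(r * 100.0);
            }

            // Save under a new name so the template file can be reused next time
            try (FileOutputStream out = new FileOutputStream("daily_sales_report.xls")) {
                wb.write(out);
            }
        }
    }
}
```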
So far, Apache POI is the only choice I have found.
"Clearly, writing ~20k rows(with some 10-20 columns in each row) gives me the dreaded "java.lang.OutOfMemoryError: Java heap space"."
"Enterprise IT"
What you can do is perform batch data insertion. Create a queue/task table; every time you generate one portion, rest for a few seconds, then continue with the next portion. If you are worried about the data changing while your queued task runs, you can first write the primary keys into the Excel file (hiding and locking that column from the user's view). The first run inserts the primary keys; from the second queued run onwards, read them back and process the task portion by portion.
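A rough sketch of that portion-by-portion idea; the portion size, the pause length, and the `writePortion` helper are all assumptions:

```java
import java.util.List;

public class PortionedExport {

    private static final int PORTION_SIZE = 5_000;

    // primaryKeys identifies the records to export; writing in portions keeps each run small
    public static void export(List<Long> primaryKeys) throws InterruptedException {
        for (int start = 0; start < primaryKeys.size(); start += PORTION_SIZE) {
            int end = Math.min(start + PORTION_SIZE, primaryKeys.size());
            List<Long> portion = primaryKeys.subList(start, end);

            writePortion(portion); // hypothetical helper: fetches these rows and appends them to the output

            Thread.sleep(2_000);   // rest for a couple of seconds before the next portion
        }
    }

    private static void writePortion(List<Long> portion) {
        // ... fetch the rows for these keys and write them out ...
    }
}
```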
There is also JExcelApi, but it uses more memory. I think you should create a .csv file and open it in Excel. That lets you push through a lot of data, but you won't be able to do any "Excel magic".
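If you go the CSV route, plain java.io is enough; a minimal sketch with simplified escaping (the file name is a placeholder):

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

public class CsvExport {

    // Writes each row as comma-separated values; Excel opens .csv files directly
    public static void write(List<List<String>> rows, String fileName) throws IOException {
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(Paths.get(fileName), StandardCharsets.UTF_8))) {
            for (List<String> row : rows) {
                out.println(row.stream().map(CsvExport::escape).collect(Collectors.joining(",")));
            }
        }
    }

    // Quote fields that contain commas, quotes, or newlines (basic CSV escaping)
    private static String escape(String value) {
        if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }
}
```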
Try the SXSSF workbook; it's a great thing for huge spreadsheet documents because it builds the document by streaming rows to disk, keeping only a small window of rows in memory instead of holding everything in RAM.
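A minimal SXSSF sketch; note that SXSSF writes the newer .xlsx format rather than .xls, and the file name and cell contents here are placeholders:

```java
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;

public class StreamingExport {
    public static void main(String[] args) throws IOException {
        // Keep only 100 rows in memory; older rows are flushed to a temporary file
        SXSSFWorkbook wb = new SXSSFWorkbook(100);
        try (FileOutputStream out = new FileOutputStream("big_report.xlsx")) {
            Sheet sheet = wb.createSheet("data");
            for (int r = 0; r < 200_000; r++) {
                Row row = sheet.createRow(r);
                for (int c = 0; c < 20; c++) {
                    row.createCell(c).setCellValue("r" + r + "c" + c);
                }
            }
            wb.write(out);
        } finally {
            wb.dispose(); // delete the temporary files backing the streamed rows
        }
    }
}
```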
We did something quite similar, with the same amount of data, and we had to switch to JExcelApi because POI is so heavy on resources. Try JExcelApi; you won't regret it when you have to manipulate big Excel files!
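For reference, writing an .xls with JExcelApi (the jxl library) looks roughly like this sketch; the file name and contents are placeholders:

```java
import jxl.Workbook;
import jxl.write.Label;
import jxl.write.WritableSheet;
import jxl.write.WritableWorkbook;
import jxl.write.WriteException;

import java.io.File;
import java.io.IOException;

public class JxlExport {
    public static void main(String[] args) throws IOException, WriteException {
        WritableWorkbook workbook = Workbook.createWorkbook(new File("output.xls"));
        try {
            WritableSheet sheet = workbook.createSheet("data", 0);
            for (int r = 0; r < 20_000; r++) {
                // jxl cell constructors take (column, row, contents)
                sheet.addCell(new Label(0, r, "row " + r));
                // fully qualified to avoid a clash with java.lang.Number
                sheet.addCell(new jxl.write.Number(1, r, r * 1.5));
            }
            workbook.write();
        } finally {
            workbook.close();
        }
    }
}
```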
We developed a Java library for this purpose, and it is currently available as an open source project: https://github.com/jbaliuka/x4j-analytic . We use it for operational reporting. We generate huge Excel files; ~200,000 rows should work without problems, and Excel manages to open such files too. Our code uses POI to load the template, but the generated content is streamed directly to the file without an XML or object model layer in memory.