Huge memory Allocation when using EPPlus Excel Library

Submitted by 孤街醉人 on 2019-12-01 03:13:02

Question


Context

I have been using EPPlus to automate Excel report generation, with C# as the client language.

Problem:

After trying to write a really big report (the result of a SQL query) with pivot tables, charts and so forth, I end up with an OutOfMemoryException.

Troubleshooting

To troubleshoot, I opened an existing 138 MB report and used the GC class to take a peek at what is happening with memory. Here are the results.

ExcelPackage pkg = new ExcelPackage(new FileInfo(@"PATH TO THE REPORT.xlsx"));
ExcelWorkbook wb = pkg.Workbook;

Garbage collection results, before and after the second line of code.

So I have no idea what to do from here. All I am doing is opening the report, which consumes roughly 10 times (9.98, actually) the report's own size in memory.

The ~138 MB Excel file takes up 1,370,817,264 bytes of RAM.
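
The measurement described above can be reproduced with a small helper along these lines. This is only a sketch: `MemoryProbe` and `MeasureRetained` are hypothetical names, not part of EPPlus, and a byte array stands in for the workbook so that the example runs without the library. `GC.GetTotalMemory(true)` forces a collection first, so the difference approximates what the loaded object keeps alive on the managed heap.

```csharp
using System;

static class MemoryProbe
{
    // Hypothetical helper: returns the growth of the managed heap, in bytes,
    // caused by running `load` while its result is still reachable.
    public static long MeasureRetained<T>(Func<T> load, out T result)
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);
        result = load();
        long after = GC.GetTotalMemory(forceFullCollection: true);
        GC.KeepAlive(result); // make sure the result is not collected early
        return after - before;
    }

    static void Main()
    {
        // Stand-in workload so the sketch runs without EPPlus: a 50 MB buffer.
        long delta = MeasureRetained(() => new byte[50 * 1024 * 1024], out byte[] buffer);
        Console.WriteLine($"Retained {delta:N0} bytes for a {buffer.Length:N0}-byte array");

        // With EPPlus referenced, the same helper could wrap the load from the question:
        // MeasureRetained(() => new ExcelPackage(new FileInfo(path)).Workbook, out var wb);
    }
}
```

Dividing the retained byte count by the file's size on disk gives the ratio quoted in the question.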

Update One:

A fairly recent beta version of EPPlus lists the following in its changelog:

New Cell store
* Less memory consumption
* Insert columns (not on the range level)
* Faster row inserts

After updating the NuGet package, I still get the same exception, but it is now thrown on the first line instead of the second.


Answer 1:


Modern Excel files (i.e., .xlsx files) are zip-compressed and often compress down to about 10% of their original size. I just uncompressed a 1.6 MB file I generated using a similar tool and found that it extracted to 18.8 MB of data.

You've got a 0.138 GB file that is using 1.370 GB of memory, so the file is almost exactly 10% of the in-memory size. The uncompressed representation in memory is what is eating your memory.

If you're curious, you can use a tool like 7-Zip to extract the Xlsx files, or you can rename the file to end in .zip and browse it in Windows.
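
The same check can be done programmatically. The following is a minimal sketch, assuming the standard `System.IO.Compression` types are available (on .NET Framework this needs a reference to the `System.IO.Compression.FileSystem` assembly); `XlsxSizeReport` and `Measure` are hypothetical names chosen for illustration. It sums the compressed and uncompressed sizes of every entry in the archive.

```csharp
using System;
using System.IO.Compression;
using System.Linq;

static class XlsxSizeReport
{
    // Returns the total compressed and uncompressed size, in bytes,
    // of all entries in a zip archive (an .xlsx file is one).
    public static (long Compressed, long Uncompressed) Measure(string path)
    {
        using (ZipArchive archive = ZipFile.OpenRead(path))
        {
            long compressed = archive.Entries.Sum(e => e.CompressedLength);
            long uncompressed = archive.Entries.Sum(e => e.Length);
            return (compressed, uncompressed);
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: XlsxSizeReport <file.xlsx>");
            return;
        }

        var (compressed, uncompressed) = Measure(args[0]);
        Console.WriteLine($"Compressed:   {compressed:N0} bytes");
        Console.WriteLine($"Uncompressed: {uncompressed:N0} bytes");
        if (uncompressed > 0)
        {
            Console.WriteLine($"Ratio:        {(double)compressed / uncompressed:P1}");
        }
    }
}
```

Note that the uncompressed XML size is only a rough indicator; the object model a library builds from that XML may be larger or smaller.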




Answer 2:


I ran into this too and found no real solution, so I had to come up with one myself. It takes the form of a new library: https://github.com/danielgindi/SpreadsheetStreams.net

It is based on a very old piece of code of mine that supported CSV and XML; I refactored the interface, added XLSX support, and published it as a standalone library.

This is not a replacement for EPPlus or other spreadsheet manipulation libraries; it only covers streaming generation of reports, and not all Excel features are supported.



Source: https://stackoverflow.com/questions/24562034/huge-memory-allocation-when-using-epplus-excel-library
