Compress large file using SharpZipLib causing Out Of Memory Exception

大城市里の小女人 提交于 2020-01-24 23:59:27

问题


I have a 453MB XML file which I'm trying to compress to a ZIP using SharpZipLib.

Below is the code I'm using to create the zip, but it's causing an OutOfMemoryException. This code successfully compresses a file of 428MB.

Any idea why the exception is happening, as I can't see why, as my system has plenty of memory available.

public void CompressFiles(List<string> pathnames, string zipPathname)
{
    try
    {
        using (FileStream stream = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            using (ZipOutputStream stream2 = new ZipOutputStream(stream))
            {
                foreach (string str in pathnames)
                {
                    FileStream stream3 = new FileStream(str, FileMode.Open, FileAccess.Read, FileShare.Read);
                    byte[] buffer = new byte[stream3.Length];
                    try
                    {
                        if (stream3.Read(buffer, 0, buffer.Length) != buffer.Length)
                        {
                            throw new Exception(string.Format("Error reading '{0}'.", str));
                        }
                    }
                    finally
                    {
                        stream3.Close();
                    }
                    ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                    stream2.PutNextEntry(entry);
                    stream2.Write(buffer, 0, buffer.Length);
                }
                stream2.Finish();
            }
        }
    }
    catch (Exception)
    {
        File.Delete(zipPathname);
        throw;
    }
}

回答1:


You're trying to create a buffer as big as the file. Instead, make the buffer a fixed size, read some bytes into it, and write the number of read bytes into the zip file.

Here's your code with a buffer of 4096 bytes (and some cleanup):

public static void CompressFiles(List<string> pathnames, string zipPathname)
{
    const int BufferSize = 4096;
    byte[] buffer = new byte[BufferSize];

    try
    {
        using (FileStream stream = new FileStream(zipPathname,
            FileMode.Create, FileAccess.Write, FileShare.None))
        using (ZipOutputStream stream2 = new ZipOutputStream(stream))
        {
            foreach (string str in pathnames)
            {
                using (FileStream stream3 = new FileStream(str,
                    FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                    stream2.PutNextEntry(entry);

                    int read;
                    while ((read = stream3.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        stream2.Write(buffer, 0, read);
                    }
                }
            }
            stream2.Finish();
        }
    }
    catch (Exception)
    {
        File.Delete(zipPathname);
        throw;
    }
}

Especially note this block:

const int BufferSize = 4096;
byte[] buffer = new byte[BufferSize];
// ...
int read;
while ((read = stream3.Read(buffer, 0, buffer.Length)) > 0)
{
    stream2.Write(buffer, 0, read);
}

This reads bytes into buffer. When there are no more bytes, the Read() method returns 0, so that's when we stop. When Read() succeeds, we can be sure there is some data in the buffer but we don't know how many bytes. The whole buffer might be filled, or just a small portion of it. Therefore, we use the number of read bytes read to determine how many bytes to write to the ZipOutputStream.

That block of code, by the way, can be replaced by a simple statement that was added to .Net 4.0, which does exactly the same:

stream3.CopyTo(stream2);

So, your code could become:

public static void CompressFiles(List<string> pathnames, string zipPathname)
{
    try
    {
        using (FileStream stream = new FileStream(zipPathname,
            FileMode.Create, FileAccess.Write, FileShare.None))
        using (ZipOutputStream stream2 = new ZipOutputStream(stream))
        {
            foreach (string str in pathnames)
            {
                using (FileStream stream3 = new FileStream(str,
                    FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                    stream2.PutNextEntry(entry);

                    stream3.CopyTo(stream2);
                }
            }
            stream2.Finish();
        }
    }
    catch (Exception)
    {
        File.Delete(zipPathname);
        throw;
    }
}

And now you know why you got the error, and how to use buffers.




回答2:


You're allocating a lot of memory for no good reason, and I bet you have a 32-bit process. 32-bit processes can only allocate up to 2GB of virtual memory in normal conditions, and the library surely allocates memory too.

Anyway, several things are wrong here:

  • byte[] buffer = new byte[stream3.Length];

    Why? You don't need to store the whole thing in memory to process it.

  • if (stream3.Read(buffer, 0, buffer.Length) != buffer.Length)

    This one is nasty. Stream.Read is explicitly allowed to return less bytes than what you asked for, and this is still a valid result. When reading a stream into a buffer you have to call Read repeatedly until the buffer is filled or the end of the stream is reached.

  • Your variables should have more meaningful names. You can easily get lost with these stream2, stream3 etc.

A simple solution would be:

using (var zipFileStream = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
using (ZipOutputStream zipStream = new ZipOutputStream(zipFileStream))
{
    foreach (string str in pathnames)
    {
        using(var itemStream = new FileStream(str, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            var entry = new ZipEntry(Path.GetFileName(str));
            zipStream.PutNextEntry(entry);
            itemStream.CopyTo(zipStream);
        }
    }
    zipStream.Finish();
}


来源:https://stackoverflow.com/questions/28235804/compress-large-file-using-sharpziplib-causing-out-of-memory-exception

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!