PDFsharp out-of-memory exception when combining multiple PDF files

∥☆過路亽.° submitted on 2019-12-08 14:12:21

Question


I need to merge a large number of PDF files (the exact number is not known in advance) into a single PDF. For this I am using the PDFsharp code shown here.

    // Get some file names
    string[] files = filesToPrint.ToArray();

    // Open the output document
    PdfDocument outputDocument = new PdfDocument();

    PdfPage newPage; 

    int nProcessedFile = 0;
    int nMemoryFile = 5;
    int nStepConverted = 0;
    String sNameLastCombineFile = ""; 


    // Iterate files
    foreach (string file in files)
    {
        // Open the document to import pages from it.
        PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);

        // Iterate pages
        int count = inputDocument.PageCount;
        for (int idx = 0; idx < count; idx++)
        {
            // Get the page from the external document...
            PdfPage page = inputDocument.Pages[idx];
            // ...and add it to the output document.
            outputDocument.AddPage(page);                                
        }

        nProcessedFile++;
        if (nProcessedFile >= nMemoryFile)
        {
            //nProcessedFile = 0;
            //nStepConverted++;
            //sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";

            //outputDocument.Save(sNameLastCombineFile);
            //outputDocument.Close();                 
        }
    }
    // Save the document...
    const string filename = "ConcatenatedDocument1_tempfile.pdf";
    outputDocument.Save(filename);
    // ...and start a viewer.
    Process.Start(filename);

For a small number of files the code works, but at some point it throws an out-of-memory exception.

Is there a solution?

P.S. I was thinking of saving the output in steps and then appending the remaining files, so that memory is freed in between, but I cannot find a way to do it.

UPDATE 1:

    if (nProcessedFile >= nMemoryFile)
    {
        nProcessedFile = 0;
        //nStepConverted++;
        sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";

        outputDocument.Save(sNameLastCombineFile);
        outputDocument.Close();

        outputDocument = PdfReader.Open(sNameLastCombineFile, PdfDocumentOpenMode.Modify);
    }

UPDATE 2: Version 1.32, complete example. The error occurs on the line: PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);

Error text: Cannot handle iref streams. The current implementation of PDFsharp cannot handle this PDF feature introduced with Acrobat 6.

using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<String> filesToPrint = new List<string>();

            filesToPrint = Directory.GetFiles(@"D:\Downloads\RACCOLTA\FILE PDF", "*.pdf").ToList();

            // Get some file names
            string[] files = filesToPrint.ToArray();

            // Open the output document
            PdfDocument outputDocument = new PdfDocument();

            PdfPage newPage;

            int nProcessedFile = 0;
            int nMemoryFile = 5;
            int nStepConverted = 0;
            String sNameLastCombineFile = "";

            try
            {
                // Iterate files
                foreach (string file in files)
                {
                    // Open the document to import pages from it.
                    PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);

                    // Iterate pages
                    int count = inputDocument.PageCount;
                    for (int idx = 0; idx < count; idx++)
                    {
                        // Get the page from the external document...
                        PdfPage page = inputDocument.Pages[idx];
                        // ...and add it to the output document.
                        outputDocument.AddPage(page);
                    }

                    nProcessedFile++;
                    if (nProcessedFile >= nMemoryFile)
                    {
                        nProcessedFile = 0;
                        //nStepConverted++;
                        sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";

                        outputDocument.Save(sNameLastCombineFile);
                        outputDocument.Close();

                        // Reopen the saved output (as in UPDATE 1) so that more pages can be appended.
                        outputDocument = PdfReader.Open(sNameLastCombineFile, PdfDocumentOpenMode.Modify);
                    }
                }
                // Save the document...
                const string filename = "ConcatenatedDocument1_tempfile.pdf";
                outputDocument.Save(filename);
                // ...and start a viewer.
                Process.Start(filename);

            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                Console.ReadKey();

            }
        }
    }
}

UPDATE 3: Code that generates the out-of-memory exception

            int count = inputDocument.PageCount;
            for (int idx = 0; idx < count; idx++)
            {
                // Get the page from the external document...
                newPage = inputDocument.Pages[idx];
                // ...and add it to the output document.
                outputDocument.AddPage(newPage);

                newPage.Close();
            }

I cannot tell exactly which line throws the exception.


Answer 1:


I had a similar issue; saving, closing and reopening the PdfDocument did not really help.

I am adding a lot (100+) of large (up to 5 MB) images (TIFF, JPG, etc.) to a PDF document, where every image has its own page. It crashed around image #50. After adding the save-close-reopen step it did finish the whole document, but memory use still came close to the maximum, around 3 GB. A few more images and it would still have crashed.

After more refining, I wrapped the XGraphics object in a using block; that was a little better again, but not much.

The big step forward was disposing of the XImage within the loop! After that the application never used more than 100-200 KB, and I removed the save-close-reopen for the PdfDocument without any problem.
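
The disposal pattern described in this answer might look roughly like the sketch below. It is not the answerer's actual code: the class name `ImagePdfBuilder`, the method `Build`, and the parameters `imagePaths` and `outputPath` are hypothetical, and it assumes the PDFsharp 1.x API (`XImage.FromFile`, `XGraphics.FromPdfPage`, `XGraphics.DrawImage`).

```csharp
using System.Collections.Generic;
using PdfSharp.Drawing;
using PdfSharp.Pdf;

// Hypothetical helper, not taken from the original answer.
static class ImagePdfBuilder
{
    // Writes one page per image and disposes each image before loading the next.
    public static void Build(IEnumerable<string> imagePaths, string outputPath)
    {
        using (PdfDocument document = new PdfDocument())
        {
            foreach (string path in imagePaths)
            {
                PdfPage page = document.AddPage();

                // Both objects are released at the end of each iteration,
                // so only one source image is held in memory at a time.
                using (XImage image = XImage.FromFile(path))
                using (XGraphics gfx = XGraphics.FromPdfPage(page))
                {
                    // Draws the image at its natural size from the top-left corner.
                    gfx.DrawImage(image, 0, 0);
                }
            }

            document.Save(outputPath);
        }
    }
}
```

The sketch leaves the page at its default size, so images larger than the page are cut off; scaling is omitted to keep the disposal pattern visible. It also assumes at least one image, since PDFsharp refuses to save a document without pages.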




Answer 2:


After saving and closing outputDocument (the code is commented out in your snippet), you have to open outputDocument again, using PdfDocumentOpenMode.Modify.

It could help to add using(...) for the inputDocument.

If your code is running as a 32-bit process, then switching to 64-bit will allow your process to use more than 2 GB of RAM (assuming your computer has more than 2 GB of RAM).

Update: The message "Cannot handle iref streams" means you have to use PDFsharp 1.50 Prerelease, available on NuGet.
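
The first two suggestions in this answer (reopening the output with PdfDocumentOpenMode.Modify and wrapping inputDocument in using) could be combined roughly as follows. This is a sketch under assumptions rather than a verified fix: `PdfMerger`, `Merge`, `outputPath` and `batchSize` are hypothetical names, the output is saved to and reopened from the same path, and whether this actually lowers peak memory depends on the input documents.

```csharp
using System.Collections.Generic;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Hypothetical helper; the names and the batching policy are illustrative only.
static class PdfMerger
{
    public static void Merge(IList<string> files, string outputPath, int batchSize)
    {
        PdfDocument output = new PdfDocument();
        try
        {
            int filesInBatch = 0;

            foreach (string file in files)
            {
                // Dispose each input document as soon as its pages are imported.
                using (PdfDocument input = PdfReader.Open(file, PdfDocumentOpenMode.Import))
                {
                    for (int idx = 0; idx < input.PageCount; idx++)
                    {
                        output.AddPage(input.Pages[idx]);
                    }
                }

                filesInBatch++;
                if (filesInBatch >= batchSize)
                {
                    // Flush to disk, then reopen in Modify mode so pages can still be appended.
                    output.Save(outputPath);
                    output.Close();
                    output = PdfReader.Open(outputPath, PdfDocumentOpenMode.Modify);
                    filesInBatch = 0;
                }
            }

            // Final save; assumes at least one page was added.
            output.Save(outputPath);
        }
        finally
        {
            output.Dispose();
        }
    }
}
```

A call such as `PdfMerger.Merge(files, "ConcatenatedDocument1_tempfile.pdf", 5)` would mirror the question's nMemoryFile = 5 setting. Input files that use iref streams would still need PDFsharp 1.50 or later, as noted in the update above.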



Source: https://stackoverflow.com/questions/33607616/pdfsharp-out-of-memory-exception-when-combine-multi-pdf-file
