Merging PDFs and preserving SetTagged

Posted by 余生长醉 on 2020-01-25 23:04:13

Question


I'm using iTextSharp 5.x. I'm trying to merge two PDFs and preserve the isTagged flag. When I remove copy.SetTagged(), the resulting PDF contains both PDFs, which is great. When I add copy.SetTagged(), I get an exception:

Exception -->System.ObjectDisposedException: Cannot access a closed file.
at System.IO.__Error.FileNotOpen()
at System.IO.FileStream.get_Position()

Here is the code

List<string> filesToMerge = new List<string> { "C:/dev/dcs/wp-cla-dcs/Hex/Docs/metadata/coverPage.pdf", "C:/dev/dcs/wp-cla-dcs/Hex/Docs/metadata/49W7a.pdf" };
string outputFileName = "C:/dev/dcs/wp-cla-dcs/Hex/Docs/metadata/results.pdf";

using (FileStream outFS = new FileStream(outputFileName, FileMode.Create))
using (Document document = new Document())
//  using (PdfCopy copy = new PdfCopy(document, outFS))
using (PdfCopy copy = new PdfSmartCopy(document, outFS))
{
    {
        copy.SetTagged();
        // Set up the iTextSharp document
        document.Open();
        foreach (string pdfFile in filesToMerge)
        {
            using (var reader = new PdfReader(pdfFile))
            {
                copy.AddDocument(reader);
                copy.FreeReader(reader);
            }
        }

    }
}

Answer 1:


Despite @bruno-lowagie's comment, I have had better results doing this with iText 5.

Using iText 7, PdfMerger left several pieces of content untagged (all were tagged in the source document). PdfCopy in iText 5, however, worked just fine; I only needed to manually add the XMP metadata, title, language, etc.:

// Requires: using System.IO; using iTextSharp.text; using iTextSharp.text.pdf;
public static void CombineMultiplePDFs(string[] fileNames, string outFile)
{
    var lang = "en";
    var title = "My new title";

    // step 1: creation of a document-object
    Document document = new Document();

    // step 2: we create a writer that listens to the document
    FileStream newFileStream = new FileStream(outFile, FileMode.Create);
    PdfCopy writer = new PdfCopy(document, newFileStream);

    writer.SetTagged();

    writer.PdfVersion = PdfWriter.VERSION_1_7;
    writer.AddViewerPreference(PdfName.DISPLAYDOCTITLE, new PdfBoolean(true));
    writer.Info.Put(PdfName.TITLE, new PdfString(title));
    writer.CreateXmpMetadata();

    // step 3: we open the document
    document.Open();

    // set meta data
    document.AddLanguage(lang);
    document.AddTitle(title);

    // keep an array of all open readers so they can be closed again.
    var readers = new PdfReader[fileNames.Length];
    for (var fi = 0; fi < fileNames.Length; fi++)
    {
        // we create a reader for a certain document
        var fileName = fileNames[fi];
        PdfReader reader = new PdfReader(fileName);
        readers[fi] = reader;
        reader.ConsolidateNamedDestinations();

        // step 4: we add content
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            // IMPORTANT: the third param is "KeepTaggedPdfStructure"
            PdfImportedPage page = writer.GetImportedPage(reader, i, true);
            writer.AddPage(page);
        }
    }

    // step 5: we close the document and writer
    writer.Close();
    document.Close();

    // close readers only after document is closed
    foreach (var r in readers)
    {
        r.Close();
    }
}
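
The same ordering can be applied to the code in the question. The following is a minimal sketch, not a verified fix. It assumes the ObjectDisposedException comes from each PdfReader being disposed by its using block before the tagged PdfCopy is closed, which is consistent with the note above about closing readers only after the document is closed. The class and method names (TaggedMergeSketch, MergeTagged) are hypothetical, and the FreeReader call from the question is left out on the assumption that the readers must stay usable until the copy is closed.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class TaggedMergeSketch
{
    // Hypothetical helper; the name is illustrative and not part of iTextSharp.
    public static void MergeTagged(IEnumerable<string> filesToMerge, string outputFileName)
    {
        // Readers are tracked here so they outlive the PdfCopy below.
        var readers = new List<PdfReader>();
        try
        {
            using (var outFS = new FileStream(outputFileName, FileMode.Create))
            using (var document = new Document())
            using (var copy = new PdfCopy(document, outFS))
            {
                copy.SetTagged();
                document.Open();

                foreach (string pdfFile in filesToMerge)
                {
                    // No using block: the reader must not be disposed yet.
                    var reader = new PdfReader(pdfFile);
                    readers.Add(reader);
                    copy.AddDocument(reader);
                }
            } // copy and document are closed here, while the readers are still open
        }
        finally
        {
            // Close the readers only after the copy and document are closed.
            foreach (var reader in readers)
            {
                reader.Close();
            }
        }
    }
}
```

It would be called with the same inputs as in the question, for example MergeTagged(filesToMerge, outputFileName).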


Source: https://stackoverflow.com/questions/47209267/merging-pdf-and-preserve-settagged
