VB.NET: Merge multiple PDFs into one and export

小蘑菇 2020-12-11 11:13

I have to merge multiple PDFs into a single PDF.

I am using the iTextSharp library. I converted the code (from here) and tried to use it. The actual code …

4 answers
  • 2020-12-11 11:26

    I have a console application that monitors the subfolders of a designated folder and merges all of the PDFs in each subfolder into a single PDF. I pass in an array of file paths (as strings) and the path of the output file I would like.

    This is the function I use.

    ' Requires at the top of the file: Imports System.IO and Imports iTextSharp.text.pdf
    Public Shared Function MergePdfFiles(ByVal pdfFiles() As String, ByVal outputPath As String) As Boolean
        Dim result As Boolean = False
        Dim pdfCount As Integer = 0     'total input pdf file count
        Dim f As Integer = 0    'pointer to current input pdf file
        Dim fileName As String
        Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
        Dim pageCount As Integer = 0
        Dim pdfDoc As iTextSharp.text.Document = Nothing    'the output pdf document
        Dim writer As PdfWriter = Nothing
        Dim cb As PdfContentByte = Nothing
    
        Dim page As PdfImportedPage = Nothing
        Dim rotation As Integer = 0
    
        Try
            pdfCount = pdfFiles.Length
            If pdfCount > 1 Then
                'Open the 1st item in the array PDFFiles
                fileName = pdfFiles(f)
                reader = New iTextSharp.text.pdf.PdfReader(fileName)
                'Get page count
                pageCount = reader.NumberOfPages
    
                pdfDoc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1), 18, 18, 18, 18)
    
                writer = PdfWriter.GetInstance(pdfDoc, New FileStream(outputPath, FileMode.OpenOrCreate))
    
    
                pdfDoc.Open()
                'Instantiate a PdfContentByte object
                cb = writer.DirectContent
                'Now loop thru the input pdfs
                While f < pdfCount
                    'Declare a page counter variable
                    Dim i As Integer = 0
                    'Loop thru the current input pdf's pages starting at page 1
                    While i < pageCount
                        i += 1
                        'Get the input page size
                        pdfDoc.SetPageSize(reader.GetPageSizeWithRotation(i))
                        'Create a new page on the output document
                        pdfDoc.NewPage()
                        'Now we get the imported page
                        page = writer.GetImportedPage(reader, i)
                        'Read the imported page's rotation
                        rotation = reader.GetPageRotation(i)
                        'Then add the imported page to the PdfContentByte object as a template based on the page's rotation
                        If rotation = 90 Then
                            cb.AddTemplate(page, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(i).Height)
                        ElseIf rotation = 270 Then
                            cb.AddTemplate(page, 0, 1.0F, -1.0F, 0, reader.GetPageSizeWithRotation(i).Width + 60, -30)
                        Else
                            cb.AddTemplate(page, 1.0F, 0, 0, 1.0F, 0, 0)
                        End If
                    End While
                    'Increment f and read the next input pdf file
                    f += 1
                    If f < pdfCount Then
                        fileName = pdfFiles(f)
                        reader = New iTextSharp.text.pdf.PdfReader(fileName)
                        pageCount = reader.NumberOfPages
                    End If
                End While
                'When all done, we close the document so that the pdfwriter object can write it to the output file
                pdfDoc.Close()
                result = True
            End If
        Catch ex As Exception
            Return False
        End Try
        Return result
    End Function
    
  • 2020-12-11 11:29

    The code that was marked correct does not close all of its file streams, so the files stay open within the app and you won't be able to delete unused PDFs within your project.

    This is a better solution:

    ' Requires at the top of the file: Imports System.IO, Imports iTextSharp.text, Imports iTextSharp.text.pdf
    Public Sub MergePDFFiles(ByVal outPutPDF As String)

        Dim StartPath As String = FileArray(0) ' FileArray is a list declared globally
        Dim outFile = outPutPDF ' the output path, passed in from the calling sub
        Dim document As Document = Nothing
        Dim writer As PdfCopy = Nothing

        Try
            document = New Document()
            writer = New PdfCopy(document, New FileStream(outFile, FileMode.Create))
            document.Open()

            For Each fileName As String In FileArray
                Dim reader = New PdfReader(Path.Combine(StartPath, fileName))
                Try
                    For i As Integer = 1 To reader.NumberOfPages
                        Dim page = writer.GetImportedPage(reader, i)
                        writer.AddPage(page)
                    Next i
                Finally
                    ' release the input file even if copying a page fails
                    reader.Close()
                End Try
            Next

        Catch ex As Exception
            ' handle or log the exception if needed

        Finally
            ' close once, here, so the output file is released on success and on failure
            If writer IsNot Nothing Then writer.Close()
            If document IsNot Nothing Then document.Close()

        End Try

    End Sub
    
  • 2020-12-11 11:33

    I realize I'm pretty late to the party, but after reading the comments from @BrunoLowagie, I wanted to see if I could put something together myself that uses the examples from his linked sample chapter. It's probably overkill, but I put together some code that merges multiple PDFs into a single file that I posted on the Code Review SE site (the post, VB.NET - Error Handling in Generic Class for PDF Merge, contains the full class code). It only merges PDF files right now, but I'm planning on adding methods for additional functionality later.

    The "master" method (towards the end of the Class block in the linked post, and also posted below for reference) handles the actual merging of the PDF files, but the multiple overloads provide a number of options for how to define the list of original files. So far, I've included the following features:

    • The methods return a System.IO.FileInfo object if the merge is successful.
    • Provide a System.IO.DirectoryInfo object or a System.String identifying a path and it will collect all PDF files in that directory (including sub-directories if specified) to merge.
    • Provide a List(Of System.String) or a List(Of System.IO.FileInfo) specifying the PDFs you want to merge.
    • Identify how the PDFs should be sorted before the merge (especially useful if you use one of the MergeAll methods to get all PDF files in a directory).
    • If the specified output PDF file already exists, you can specify whether or not you want to overwrite it. (I'm considering adding the "ability" to automatically adjust the output PDF file's name if it already exists).
    • Warning and Error properties provide a way to get feedback in the calling method, whether or not the merge is successful.

    Once the code is in place, it can be used like this:

    Dim PDFDir As New IO.DirectoryInfo("C:\Test Data\PDF\")
    Dim ResultFile As IO.FileInfo = Nothing
    Dim Merger As New PDFManipulator
    
    ResultFile = Merger.MergeAll(PDFDir, "C:\Test Data\PDF\Merged.pdf", True, PDFManipulator.PDFMergeSortOrder.FileName, True)
    

    Here is the "master" method. As I said, it's probably overkill (and I'm still tweaking it some), but I wanted to do my best to try to make it work as effectively as possible. Obviously it requires a Reference to the itextsharp.dll for access to the library's functions.

    I've commented out the references to the Error and Warning properties of the class for this post to help reduce any confusion.

    Public Function Merge(ByVal PDFFiles As List(Of System.IO.FileInfo), ByVal OutputFileName As String, ByVal OverwriteExistingPDF As Boolean, ByVal SortOrder As PDFMergeSortOrder) As System.IO.FileInfo
        Dim ResultFile As System.IO.FileInfo = Nothing
        Dim ContinueMerge As Boolean = True
    
        If OverwriteExistingPDF Then
            If System.IO.File.Exists(OutputFileName) Then
                Try
                    System.IO.File.Delete(OutputFileName)
                Catch ex As Exception
                    ContinueMerge = False
    
                    'If Errors Is Nothing Then
                    '    Errors = New List(Of String)
                    'End If
    
                    'Errors.Add("Could not delete existing output file.")
    
                    Throw
                End Try
            End If
        End If
    
        If ContinueMerge Then
            Dim OutputPDF As iTextSharp.text.Document = Nothing
            Dim Copier As iTextSharp.text.pdf.PdfCopy = Nothing
            Dim PDFStream As System.IO.FileStream = Nothing
            Dim SortedList As New List(Of System.IO.FileInfo)
    
            Try
                Select Case SortOrder
                    Case PDFMergeSortOrder.Original
                        SortedList = PDFFiles
                    Case PDFMergeSortOrder.FileDate
                        SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.LastWriteTime).ToList
                    Case PDFMergeSortOrder.FileName
                        SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.Name).ToList
                    Case PDFMergeSortOrder.FileNameWithDirectory
                        SortedList = PDFFiles.OrderBy(Function(f As System.IO.FileInfo) f.FullName).ToList
                End Select
    
                If Not IO.Directory.Exists(New IO.FileInfo(OutputFileName).DirectoryName) Then
                    Try
                        IO.Directory.CreateDirectory(New IO.FileInfo(OutputFileName).DirectoryName)
                    Catch ex As Exception
                        ContinueMerge = False
    
                        'If Errors Is Nothing Then
                        '    Errors = New List(Of String)
                        'End If
    
                        'Errors.Add("Could not create output directory.")
    
                        Throw
                    End Try
                End If
    
                If ContinueMerge Then
                    OutputPDF = New iTextSharp.text.Document
                    PDFStream = New System.IO.FileStream(OutputFileName, System.IO.FileMode.Create) 'Create truncates a file that already exists
                    Copier = New iTextSharp.text.pdf.PdfCopy(OutputPDF, PDFStream)
    
                    OutputPDF.Open()
    
                    For Each PDF As System.IO.FileInfo In SortedList
                        If ContinueMerge Then
                            Dim InputReader As iTextSharp.text.pdf.PdfReader = Nothing
    
                            Try
                                InputReader = New iTextSharp.text.pdf.PdfReader(PDF.FullName)
    
                                For page As Integer = 1 To InputReader.NumberOfPages
                                    Copier.AddPage(Copier.GetImportedPage(InputReader, page))
                                Next page
    
                                If InputReader.IsRebuilt Then
                                    'If Warnings Is Nothing Then
                                    '    Warnings = New List(Of String)
                                    'End If
    
                                    'Warnings.Add("Damaged PDF: " & PDF.FullName & " repaired and successfully merged into output file.")
                                End If
                            Catch InvalidEx As iTextSharp.text.exceptions.InvalidPdfException
                                'Skip this file
                                'If Errors Is Nothing Then
                                '    Errors = New List(Of String)
                                'End If
    
                                'Errors.Add("Invalid PDF: " & PDF.FullName & " not merged into output file.")
                            Catch FormatEx As iTextSharp.text.pdf.BadPdfFormatException
                                'Skip this file
                                'If Errors Is Nothing Then
                                '    Errors = New List(Of String)
                                'End If
    
                                'Errors.Add("Bad PDF Format: " & PDF.FullName & " not merged into output file.")
                            Catch PasswordEx As iTextSharp.text.exceptions.BadPasswordException
                                'Skip this file
                                'If Errors Is Nothing Then
                                '    Errors = New List(Of String)
                                'End If
    
                                'Errors.Add("Password-protected PDF: " & PDF.FullName & " not merged into output file.")
                            Catch OtherEx As Exception
                                ContinueMerge = False
                            Finally
                                If Not InputReader Is Nothing Then
                                    InputReader.Close()
                                    InputReader.Dispose()
                                End If
                            End Try
                        End If
                    Next PDF
                End If
            Catch ex As iTextSharp.text.pdf.PdfException
                ResultFile = Nothing
                ContinueMerge = False
    
                'If Errors Is Nothing Then
                '    Errors = New List(Of String)
                'End If
    
                'Errors.Add("iTextSharp Error: " & ex.Message)
    
                If System.IO.File.Exists(OutputFileName) Then
                    If Not OutputPDF Is Nothing Then
                        OutputPDF.Close()
                        OutputPDF.Dispose()
                    End If
    
                    If Not PDFStream Is Nothing Then
                        PDFStream.Close()
                        PDFStream.Dispose()
                    End If
    
                    If Not Copier Is Nothing Then
                        Copier.Close()
                        Copier.Dispose()
                    End If
    
                    System.IO.File.Delete(OutputFileName)
                End If
    
                Throw
            Catch other As Exception
                ResultFile = Nothing
                ContinueMerge = False
    
                'If Errors Is Nothing Then
                '    Errors = New List(Of String)
                'End If
    
                'Errors.Add("General Error: " & other.Message)
    
                If System.IO.File.Exists(OutputFileName) Then
                    If Not OutputPDF Is Nothing Then
                        OutputPDF.Close()
                        OutputPDF.Dispose()
                    End If
    
                    If Not PDFStream Is Nothing Then
                        PDFStream.Close()
                        PDFStream.Dispose()
                    End If
    
                    If Not Copier Is Nothing Then
                        Copier.Close()
                        Copier.Dispose()
                    End If
    
                    System.IO.File.Delete(OutputFileName)
                End If
    
                Throw
            Finally
                If Not OutputPDF Is Nothing Then
                    OutputPDF.Close()
                    OutputPDF.Dispose()
                End If
    
                If Not PDFStream Is Nothing Then
                    PDFStream.Close()
                    PDFStream.Dispose()
                End If
    
                If Not Copier Is Nothing Then
                    Copier.Close()
                    Copier.Dispose()
                End If
    
                If System.IO.File.Exists(OutputFileName) Then
                    If ContinueMerge Then
                        ResultFile = New System.IO.FileInfo(OutputFileName)
    
                        If ResultFile.Length <= 0 Then
                            ResultFile = Nothing
    
                            Try
                                System.IO.File.Delete(OutputFileName)
                            Catch ex As Exception
                                Throw
                            End Try
                        End If
                    Else
                        ResultFile = Nothing
    
                        Try
                            System.IO.File.Delete(OutputFileName)
                        Catch ex As Exception
                            Throw
                        End Try
                    End If
                Else
                    ResultFile = Nothing
                End If
            End Try
        End If
    
        Return ResultFile
    End Function
    
  • 2020-12-11 11:39

    Some of you may have to change the line "writer = PdfWriter.GetInstance(pdfDoc, New FileStream(outputPath, FileMode.OpenOrCreate))", as iTextSharp may not support it.

    Change it to:

    Dim fs As IO.FileStream = New IO.FileStream(outputPath, IO.FileMode.Create)
    
    writer = iTextSharp.text.pdf.PdfWriter.GetInstance(pdfDoc, fs)
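
    One likely reason the change helps, sketched here with plain System.IO and no iTextSharp at all: FileMode.OpenOrCreate does not truncate a file that already exists, so if an older, longer output file is on disk, its trailing bytes stay behind the newly written data. FileMode.Create starts from an empty file. The module name and the temp file name below are made up for this demonstration and are not part of the answers above.

    ```vb
    Imports System.IO
    Imports System.Text

    Module FileModeDemo
        Sub Main()
            ' Hypothetical scratch file in the temp folder.
            Dim outputPath As String = Path.Combine(Path.GetTempPath(), "filemode-demo.bin")
            Dim newData As Byte() = Encoding.ASCII.GetBytes("NEW")

            ' Simulate an older, longer output file that is already on disk.
            File.WriteAllText(outputPath, "OLD-OLD-OLD-OLD-OLD", Encoding.ASCII)

            ' OpenOrCreate keeps the existing bytes and writes from position 0.
            Using fs As New FileStream(outputPath, FileMode.OpenOrCreate)
                fs.Write(newData, 0, newData.Length)
            End Using
            Console.WriteLine(File.ReadAllText(outputPath, Encoding.ASCII)) ' prints NEW-OLD-OLD-OLD-OLD

            ' Create truncates the existing file before writing.
            Using fs As New FileStream(outputPath, FileMode.Create)
                fs.Write(newData, 0, newData.Length)
            End Using
            Console.WriteLine(File.ReadAllText(outputPath, Encoding.ASCII)) ' prints NEW

            File.Delete(outputPath)
        End Sub
    End Module
    ```

    With a real merge, the leftover bytes in the first case would sit after the end of the new PDF data, which can leave the output file larger than expected or unreadable in some viewers.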
    