Using iTextSharp (or any C# PDF library), how do I open a PDF, replace some text, and save it again?

孤城傲影 2021-02-04 16:09

Using iTextSharp (or any C# PDF library), I need to open a PDF, replace some placeholder text with actual values, and return the result as a byte[].

Can someone suggest how to do this?

3 answers
  •  暗喜 (OP)
    2021-02-04 16:40

    I have a Python script here that replaces some text in a PDF:

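    import re
    import sys
    import zlib

    # Script to find and replace text in PDF files
    #
    # Usage:
    #   python pdf_replace.py <input_filename> <text_to_find> <text_to_replace> <output_filename>
    #
    # @author Ionox0

    input_filename = sys.argv[1]
    # The PDF is read as bytes, so the search strings are encoded as well
    # (latin-1 assumes simple single-byte text in the content streams)
    text_to_find = sys.argv[2].encode('latin-1')
    text_to_replace = sys.argv[3].encode('latin-1')
    output_filename = sys.argv[4]

    with open(input_filename, 'rb') as f:
        pdf = f.read()

    # Copy of the PDF content to make edits to (bytes are immutable)
    pdf_copy = pdf

    # Search for FlateDecode stream objects that may contain the text
    stream = re.compile(rb'.*?FlateDecode.*?stream(.*?)endstream', re.S)

    for s in stream.findall(pdf):
        s = s.strip(b'\r\n')

        try:
            text = zlib.decompress(s)
        except zlib.error:
            # Not valid zlib data; leave this stream untouched
            continue

        if text_to_find in text:
            print('Found match:')
            print(text)

            text = text.replace(text_to_find, text_to_replace)
            # Caveat: /Length entries and xref offsets are not updated,
            # so strict PDF readers may reject the output
            pdf_copy = pdf_copy.replace(s, zlib.compress(text))

    with open(output_filename, 'wb') as out:
        out.write(pdf_copy)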
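
    Since the question asks for the result as a byte[], the same idea can be wrapped in a function that works entirely in memory. This is only a minimal sketch in Python, not C#/iTextSharp; the name replace_pdf_text and its parameters are hypothetical, and it has the same limitations as the script (text must appear literally inside a FlateDecode stream, and /Length and xref offsets are left as they were).

```python
import re
import zlib

# Same pattern as the script: capture the raw bytes of FlateDecode streams.
STREAM_RE = re.compile(rb'FlateDecode.*?stream(.*?)endstream', re.S)


def replace_pdf_text(pdf: bytes, find: bytes, replacement: bytes) -> bytes:
    """Return a copy of `pdf` with `find` replaced inside FlateDecode streams.

    Hypothetical helper: it does not update /Length entries or the xref table.
    """
    result = pdf
    for raw in STREAM_RE.findall(pdf):
        raw = raw.strip(b'\r\n')
        try:
            text = zlib.decompress(raw)
        except zlib.error:
            continue  # not zlib data, leave this stream untouched
        if find in text:
            new_text = text.replace(find, replacement)
            result = result.replace(raw, zlib.compress(new_text))
    return result
```

    It could be called as new_pdf = replace_pdf_text(pdf_bytes, b'NAME', b'Alice'), where pdf_bytes holds the contents of the original file.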
    
