Batch fill PDF forms from Python or Bash

佛祖请我去吃肉 2020-12-08 03:37

I have a PDF form that needs to be filled out a bunch of times (it's a timesheet to be exact). Now since I don't want to do this by hand, I was looking for a way to fill t

3 answers
  •  南方客 (original poster)
    2020-12-08 04:03

    For Python you'll need the fdfgen library and pdftk.

    @Hugh Bothwell's comment is 100% correct so I'll extend that answer with a working implementation.

    If you're on Windows, you'll also need to make sure both python and pdftk are on the system PATH (unless you want to type out their full paths).
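
    A quick way to check the PATH before running anything is sketched below. It only uses `shutil.which` from the standard library; the helper name `missing_tools` is made up for this example.

```python
import shutil


def missing_tools(names):
    """Return the names from `names` that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["pdftk"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("pdftk found at", shutil.which("pdftk"))
```

    If pdftk shows up as missing, either add its install folder to the PATH or use its full path in the command.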

    Here's the code to auto-batch-fill a collection of PDF forms from a CSV data file:

    import csv
    import os
    import subprocess

    from fdfgen import forge_fdf

    # Configuration: adjust these to match your own files.
    filename_prefix = "NVC"
    csv_file = "NVC.csv"
    pdf_file = "NVC.pdf"
    tmp_file = "tmp.fdf"
    output_folder = './output/'

    def process_csv(file):
        """Return one list of (field_name, value) tuples per data row."""
        data = []
        with open(file, newline='') as f:
            reader = csv.reader(f)
            headers = next(reader, [])  # first row holds the PDF field names
            for row in reader:
                # zip() pairs each header with the value in the same column
                data.append(list(zip(headers, row)))
        return data

    def form_fill(fields):
        # forge_fdf returns bytes on Python 3, so write the FDF in binary mode
        fdf = forge_fdf("", fields, [], [], [])
        with open(tmp_file, "wb") as fdf_file:
            fdf_file.write(fdf)
        # The value in the second CSV column becomes part of the output file name
        output_file = '{0}{1} {2}.pdf'.format(output_folder, filename_prefix, fields[1][1])
        # Passing a list avoids shell quoting problems with spaces in file names;
        # check=True raises an error if pdftk fails
        subprocess.run(['pdftk', pdf_file, 'fill_form', tmp_file,
                        'output', output_file, 'dont_ask'], check=True)
        os.remove(tmp_file)

    data = process_csv(csv_file)
    print('Generating Forms:')
    print('-----------------------')
    for fields in data:
        # Skip rows whose first column is 'Yes'
        if fields[0][1] == 'Yes':
            continue
        print('{0} {1} created...'.format(filename_prefix, fields[1][1]))
        form_fill(fields)

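    Note: It shouldn't be rocket surgery to figure out how to customize this. The variable declarations at the top hold the custom configuration. Two conventions are hard-coded in the script: rows whose first column is 'Yes' are skipped, and the value in the second column is used in the output file name.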

    In the CSV, each column of the first row holds the name of the corresponding field in the PDF file. Any columns that don't have a corresponding field in the template will be ignored.
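
    To make the CSV layout concrete, here is a small sketch of how a header row and its data rows turn into the (field name, value) pairs that get passed to forge_fdf. The column names and values are invented sample data, not taken from any real form.

```python
import csv
import io

# Hypothetical sample data; the column names stand in for your PDF field names.
sample = io.StringIO(
    "Submitted,Name,Hours\n"
    "No,Alice,40\n"
    "Yes,Bob,38\n"
)

reader = csv.reader(sample)
headers = next(reader)  # first row: field names
# Pair every value with the header of its column, one list per data row
rows = [list(zip(headers, row)) for row in reader]

print(rows[0])
# [('Submitted', 'No'), ('Name', 'Alice'), ('Hours', '40')]
```

    With the script above, the second row would be skipped because its first column is 'Yes', and the first row would be written to a file whose name contains 'Alice' (the second column).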

    In the PDF template, just create editable fields where you want your data to fill and make sure the names match up with the CSV data.
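
    If you're not sure what the fields in your template are called, pdftk can list them with its dump_data_fields operation. The sketch below wraps that call and pulls out the `FieldName:` lines; the function names are made up for this example, and it assumes pdftk is on the PATH.

```python
import subprocess


def parse_field_names(dump_text):
    """Extract field names from the text printed by `pdftk ... dump_data_fields`."""
    prefix = "FieldName: "
    return [line[len(prefix):].strip()
            for line in dump_text.splitlines()
            if line.startswith(prefix)]


def pdf_field_names(pdf_path):
    # Runs pdftk, so pdftk must be installed and on the PATH.
    result = subprocess.run(["pdftk", pdf_path, "dump_data_fields"],
                            capture_output=True, text=True, check=True)
    return parse_field_names(result.stdout)
```

    Comparing `pdf_field_names("NVC.pdf")` with the header row of the CSV is a quick way to spot a misspelled column before generating a whole batch.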

    For this specific configuration, put the script in the same folder as your NVC.csv, NVC.pdf, and a folder named 'output'. Run it and it automagically does the rest.
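
    If you'd rather not create the 'output' folder by hand, a couple of lines can do it before the forms are generated. This is an optional addition, not part of the script above, and the helper name is hypothetical.

```python
import os


def ensure_output_folder(path):
    """Create the output folder if it does not exist yet and return the path."""
    os.makedirs(path, exist_ok=True)  # no error if the folder is already there
    return path
```

    Calling `ensure_output_folder('./output/')` once at the start is enough.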
