How to program a text search and replace in PDF files

不思量自难忘° 2020-12-13 02:41

How would I be able to programmatically search and replace some text in a large number of PDF files? I would like to remove a URL that has been added to a set of files. I

8 answers
  • 2020-12-13 03:09

    Finding text in a PDF can be inherently hard because of the graphical nature of the document format -- the letters you are searching for may not be contiguous in the file. That said, CAM::PDF has some search-replace capabilities and heuristics. Give changepagestring.pl a try and see if it works on your PDFs.
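
    To illustrate why a plain byte search can miss text, here is a minimal Python sketch. It assumes an uncompressed content stream and uses a made-up URL and kerning values; the helper name `text_from_tj` is hypothetical. Real PDFs usually compress their streams and may use font-specific encodings, so treat this as a demonstration of the problem, not a general extractor.

```python
import re

# Hypothetical, uncompressed content-stream fragment. The TJ operator draws an
# array of string pieces separated by kerning adjustments, so the URL never
# appears as one contiguous run of bytes.
STREAM = b"BT /F1 12 Tf 72 720 Td [(http://exa) -15 (mple.c) 20 (om/page)] TJ ET"

# One literal string "( ... )", allowing backslash escapes inside.
# Escapes are matched but left undecoded in this sketch.
_LITERAL = re.compile(rb"\(((?:\\.|[^\\()])*)\)")
# One "[ ... ] TJ" array (a "]" inside a string literal is not handled).
_TJ_ARRAY = re.compile(rb"\[((?:\\.|[^\]\\])*)\]\s*TJ")


def text_from_tj(stream: bytes) -> list[bytes]:
    """Join the string pieces of every TJ array found in the stream."""
    joined = []
    for array in _TJ_ARRAY.finditer(stream):
        pieces = _LITERAL.findall(array.group(1))
        joined.append(b"".join(pieces))
    return joined


needle = b"http://example.com/page"
found_raw = needle in STREAM  # search the raw bytes
found_joined = any(needle in text for text in text_from_tj(STREAM))
print(found_raw, found_joined)  # prints: False True
```

    The raw search fails because the URL is split into three pieces, while the search over the joined pieces succeeds. Tools such as CAM::PDF need heuristics for the same reason.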

  • 2020-12-13 03:14

    You can use the VeryPDF PDF Text Replacer Command Line software to batch-replace text in PDF pages. Run pdftr.exe with the option you need, for example:

    pdftr.exe -contentreplace "My Name=>Your Name" D:\in.pdf D:\out.pdf

    pdftr.exe -searchandoverlaytext "My Name=>Your Name" D:\in.pdf D:\out.pdf

    pdftr.exe -searchandoverlaytext "My Name=>D:\temp\myname.png*20*20" D:\in.pdf D:\out.pdf

    pdftr.exe -pagerange 1-3 -contentreplace "Old Text=>New Text||VeryPDF=>VeryDOC||My Name=>Your Name" D:\in.pdf D:\out.pdf

    pdftr.exe -searchtext "string" C:\in.pdf

    pdftr.exe -pagerange 1 -searchtext "string" C:\in.pdf

    pdftr.exe -pagerange 1 -searchandoverlaytext "Old Text=>New Text||VeryPDF=>VeryDOC||My Name=>Your Name" D:\in.pdf D:\out.pdf

    pdftr.exe -overlaytextfontname "Arial" -overlaytextcolor FF0000 -overlaybgcolor 00FF00 -searchandoverlaytext "Old Text=>New Text||VeryPDF=>VeryDOC||My Name=>Your Name" D:\in.pdf D:\out.pdf

    pdftr.exe -opw 123 -upw 456 -contentreplace "Old Text=>New Text||VeryPDF=>VeryDOC||My Name=>Your Name" D:\in.pdf D:\out.pdf

    pdftr.exe -searchandoverlaytext "PDFcamp Printer=>VeryPDF Printer" -overlaytextfontsize 8 D:\in.pdf D:\out.pdf

    pdftr.exe -searchandoverlaytext "PDFcamp Printer=>VeryPDF Printer" -overlaytextfontsize 80% D:\in.pdf D:\out.pdf
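
    For a large number of files, the single-file commands above could be driven from a small script. The following Python sketch only builds the command lines, using the `-contentreplace` form shown above. The `build_commands` helper, the folder layout, and the example strings are assumptions for illustration, and whether pdftr.exe is on your PATH depends on your setup.

```python
from pathlib import Path


def build_commands(in_dir: Path, out_dir: Path, old: str, new: str,
                   exe: str = "pdftr.exe") -> list[list[str]]:
    """Return one pdftr command (as an argument list) per PDF in in_dir."""
    rule = f"{old}=>{new}"  # same "Old=>New" syntax as the examples above
    commands = []
    # sorted() makes the processing order predictable
    for pdf in sorted(in_dir.glob("*.pdf")):
        commands.append([exe, "-contentreplace", rule,
                         str(pdf), str(out_dir / pdf.name)])
    return commands
```

    Each returned list can be printed for review first and then passed to `subprocess.run(cmd, check=True)` once the output folder exists. Writing to a separate output folder keeps the original files untouched if a replacement goes wrong.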

  • 2020-12-13 03:17

    You can use the 'redaction' feature in Adobe Acrobat Pro to find and replace all references in a single document in one step. I am not sure whether it can be automated to run over multiple files.

    http://help.adobe.com/en_US/Acrobat/9.0/Professional/WS5E28D332-9FF7-4569-AFAD-79AD60092D4D.w.html

  • 2020-12-13 03:18

    I just finished trying out Infix on a text laden with diacritics, hoping to generate another text in which characters with double and composed diacritics are replaced by alternates with single diacritics. Infix is definitely a good solution for someone who does not want the trouble of understanding how programmatic solutions work. All the requested changes were made. I still need to understand how to reflow words when a replacement changes the layout of the text.

  • 2020-12-13 03:23

    The question is for a programmatic solution, but I will still share this free online tool which helped me mass replace text in some PDF files:

    http://www.pdfdu.com/pdf-replace-text.aspx

    I did not notice any ads or other modifications in the resulting PDF files after replacing the text.

    I was not able to make the changes locally with the software I tried. I think the main problem was that I was missing the font used in the PDF and it did not work properly, even with Acrobat Pro. The online tool did not complain and produced a great result.

  • 2020-12-13 03:26

    This is just half a solution, but I used Touch up combined with AppleScript's support for sending keystrokes to replace a string in thousands of table cells. Depending on how your pages are laid out, it could work for you. In my case I had to manually place the cursor at the beginning of every table (tens of tables, quite manageable for a manual process), but after that I replaced thousands of cells automatically.
