read, highlight, save PDF programmatically

挽巷 2021-01-04 11:37

I'd like to write a small script (which will run on a headless Linux server) that reads a PDF, highlights text that matches anything in an array of strings that I pass, the

3 answers
  •  长发绾君心
    2021-01-04 11:52

    PDFlib has Python bindings and supports these operations. You will want PDFlib with PDI if you want to open an existing PDF (http://www.pdflib.com/products/pdflib-family/pdflib-pdi/), and TET to extract the text.

    Unfortunately, it is a commercial product. I have used this library in production in the past and it works great. The bindings are very functional and not very Pythonic. I have seen some attempts to make them more Pythonic: https://github.com/alexhayes/pythonic-pdflib To open the input PDF, you will want to use open_pdi_document().

    It sounds like you will want to do search highlighting of some sort:

    http://www.pdflib.com/tet-cookbook/tet-and-pdflib/highlight-search-terms/
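
To make the matching step concrete, here is a minimal sketch of deciding which words to highlight once the text and its coordinates have been extracted. It is plain Python and does not use the PDFlib or TET API: `WordBox`, `find_highlight_rects`, and the `padding` parameter are hypothetical names made up for illustration, and it assumes the extraction step gives you one bounding box per word in PDF points.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass(frozen=True)
class WordBox:
    """One extracted word plus its bounding box (hypothetical structure)."""
    text: str
    x: float
    y: float
    width: float
    height: float


def find_highlight_rects(words: Iterable[WordBox],
                         terms: Iterable[str],
                         padding: float = 1.0) -> List[Rect]:
    """Return a padded rectangle for every word that contains any search term.

    Matching is case-insensitive and substring-based; empty terms are ignored.
    """
    needles = [t.lower() for t in terms if t]
    rects: List[Rect] = []
    for word in words:
        lowered = word.text.lower()
        if any(needle in lowered for needle in needles):
            rects.append((word.x - padding,
                          word.y - padding,
                          word.width + 2 * padding,
                          word.height + 2 * padding))
    return rects


# Example input: in a real script these boxes would come from the
# text-extraction step, not be written by hand.
words = [
    WordBox("Invoice", 72.0, 700.0, 40.0, 12.0),
    WordBox("total", 120.0, 700.0, 25.0, 12.0),
    WordBox("due", 150.0, 700.0, 18.0, 12.0),
]
rects = find_highlight_rects(words, ["invoice", "due"])
print(rects)  # [(71.0, 699.0, 42.0, 14.0), (149.0, 699.0, 20.0, 14.0)]
```

Each returned rectangle would then be drawn over the imported page by whichever PDF library you use. Note that this sketch only matches within a single word; search strings that span several words would need the boxes of adjacent words to be merged first.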
