Detect and alter strings in PDFs

甜味超标 2020-12-11 04:34

I want to be able to detect a pattern in a PDF and somehow flag it.

For instance, in this PDF, there's the string *2. I want to be able to parse the PD

2 answers

既然无缘 (original poster)

2020-12-11 04:43

This is non-trivial. The problem is that PDF files are not designed to be edited at any granularity finer than a page. You basically have to parse the page, adjust its content stream (the PostScript-like drawing operators), and then write the whole page back out. I don't think PyPDF has support for doing what you want.

If "all" you want to do is to add highlighting you can probably just use the annotation dictionary. See the PDF specification for more information.
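To make the annotation idea a bit more concrete, here is a rough sketch of what a highlight annotation dictionary looks like as raw PDF syntax. The helper name `highlight_annotation` and the sample rectangle are made up for illustration, and the corner order used for `/QuadPoints` (upper-left, upper-right, lower-left, lower-right) is an assumption based on what common viewers seem to expect. The result would still have to be added to the page's `/Annots` array, and how you do that depends on the library you use to write the file.

```python
def highlight_annotation(rect, color=(1.0, 1.0, 0.0)):
    """Return the text of a PDF Highlight annotation dictionary.

    rect is (x0, y0, x1, y1) in PDF user-space units (origin at bottom-left).
    This is a hypothetical helper, not part of any PDF library.
    """
    x0, y0, x1, y1 = rect
    # Normalize so the corners can be given in any order.
    left, right = min(x0, x1), max(x0, x1)
    bottom, top = min(y0, y1), max(y0, y1)

    # Assumed corner order: upper-left, upper-right, lower-left, lower-right.
    quad = [left, top, right, top, left, bottom, right, bottom]

    def nums(values):
        # Fixed-point output without trailing zeros (no exponent notation).
        return " ".join(f"{v:.2f}".rstrip("0").rstrip(".") for v in values)

    return (
        "<< /Type /Annot /Subtype /Highlight "
        f"/Rect [{nums([left, bottom, right, top])}] "
        f"/QuadPoints [{nums(quad)}] "
        f"/C [{nums(color)}] >>"
    )


if __name__ == "__main__":
    # Example rectangle around a short run of text; coordinates are invented.
    print(highlight_annotation((72, 700, 200, 712)))
```

The rectangle itself has to come from wherever the matched text sits on the page, which is the hard part the rest of this answer is about.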

You might be able to do this with PyPDF2, but I haven't looked into it closely.
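The detection half is the easier part once you have the text of each page as a plain string from whatever extraction library you end up using. A minimal sketch, assuming `pages` is such a list of strings and that the pattern you care about looks like an asterisk followed by digits (a guess generalized from the `*2` example in the question):

```python
import re


def find_flags(pages, pattern=r"\*\d+"):
    """Return (page_number, start_offset, matched_text) for every match.

    pages is a list of already-extracted page texts; the default pattern
    is an assumption, adjust it to whatever you actually need to flag.
    """
    regex = re.compile(pattern)
    hits = []
    for page_number, text in enumerate(pages, start=1):
        for match in regex.finditer(text):
            hits.append((page_number, match.start(), match.group()))
    return hits


if __name__ == "__main__":
    # Invented sample text standing in for real extracted pages.
    sample = ["Total due *2 see note", "nothing here", "a *10 and *3"]
    for hit in find_flags(sample):
        print(hit)
```

Keep in mind that the offsets refer to the extracted string, not to positions on the page, and that extracted text does not always come out in reading order, so a match here only tells you which page to look at.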
