ITextSharp Find coordinates of specific text in PDF

Posted by 荒凉一梦 on 2019-11-29 16:59:55

You can use the parser package of iText(Sharp) to find the position of a given text. You will have to implement your own RenderListener, though, because that package is designed mainly for text extraction, not for locating text.

It is not as easy as you might think because, for example, the individual characters of a word may arrive separately and in any order.
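To illustrate the ordering problem, here is a minimal sketch in plain C# with no iText dependency. It assumes the text chunks and the start coordinates of their baselines have already been collected; the `Chunk` type, the `ChunkSearch.FindText` helper, and the line tolerance of 1 unit are hypothetical names and values made up for this example. The sketch groups chunks into lines, orders each line from left to right, and maps a match back to the chunk in which it begins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical container: one piece of text plus the start of its baseline.
public sealed class Chunk
{
    public string Text { get; }
    public float X { get; }
    public float Y { get; }
    public Chunk(string text, float x, float y) { Text = text; X = x; Y = y; }
}

public static class ChunkSearch
{
    // Returns the (x, y) of the chunk in which the first match of 'needle'
    // begins, or null if no line contains it. Chunks whose y differs from the
    // first chunk of a line by at most 'lineTolerance' count as the same line.
    public static Tuple<float, float> FindText(
        IEnumerable<Chunk> chunks, string needle, float lineTolerance = 1f)
    {
        // PDF y coordinates grow upwards, so sort top of page first.
        var lines = new List<List<Chunk>>();
        foreach (var chunk in chunks.OrderByDescending(k => k.Y))
        {
            var current = lines.LastOrDefault();
            if (current == null || Math.Abs(current[0].Y - chunk.Y) > lineTolerance)
            {
                current = new List<Chunk>();
                lines.Add(current);
            }
            current.Add(chunk);
        }

        foreach (var line in lines)
        {
            var sorted = line.OrderBy(k => k.X).ToList();
            var text = new StringBuilder();
            var starts = new List<int>(); // offset in 'text' where each chunk begins
            foreach (var chunk in sorted)
            {
                starts.Add(text.Length);
                text.Append(chunk.Text);
            }

            int hit = text.ToString().IndexOf(needle, StringComparison.Ordinal);
            if (hit < 0) continue;

            // The last chunk that starts at or before the match offset.
            int index = starts.FindLastIndex(s => s <= hit);
            return Tuple.Create(sorted[index].X, sorted[index].Y);
        }
        return null;
    }
}
```

The sketch concatenates chunks without inserting spaces and does not handle matches that span several lines or rotated text, so a real implementation would need more care on those points.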

PS:

First you will have to find out, though, whether the line for the signature consists of characters (as your question seems to imply) or whether it is a drawn path. Additionally you will have to find out whether that line is unique in the document.

In the former case, the RenderListener implementation you need has to inspect the TextRenderInfo objects forwarded to its RenderText method. If the text content contains the unique characters that make up the signature line, you have to store the position data of that TextRenderInfo. If the line characters are not unique, you will have to find some additional criteria that make them unique, e.g. some preceding string, or possibly the fact that it is the last occurrence of those characters in the document.
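The listener described above might look roughly like the following. This is a sketch that assumes iTextSharp 5.x (the `iTextSharp.text.pdf.parser` namespace); the class names `MarkerLocationListener` and `MarkerFinder` and the `marker` parameter are hypothetical. It only catches a marker that arrives inside a single text chunk; if the characters are spread over several chunks, the hits would have to be combined afterwards.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical listener: records the baseline start point of every text
// chunk whose content contains the marker string.
public class MarkerLocationListener : IRenderListener
{
    private readonly string marker;
    public List<Vector> Hits { get; } = new List<Vector>();

    public MarkerLocationListener(string marker) { this.marker = marker; }

    public void RenderText(TextRenderInfo renderInfo)
    {
        string text = renderInfo.GetText();
        if (text != null && text.Contains(marker))
        {
            // Start of the chunk's baseline, in page coordinates.
            Hits.Add(renderInfo.GetBaseline().GetStartPoint());
        }
    }

    // The remaining callbacks are not needed for locating text.
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderImage(ImageRenderInfo renderInfo) { }
}

public static class MarkerFinder
{
    // Runs the listener over one page (1-based) and returns the hits.
    public static List<Vector> Find(string pdfPath, int pageNumber, string marker)
    {
        var reader = new PdfReader(pdfPath);
        try
        {
            var listener = new MarkerLocationListener(marker);
            new PdfReaderContentParser(reader).ProcessContent(pageNumber, listener);
            return listener.Hits;
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Each hit exposes its coordinates through the `Vector` indexer, for example `hit[Vector.I1]` for x and `hit[Vector.I2]` for y. To apply the "last occurrence" criterion, you could take the final hit from the last page that yields any.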

In the latter case the parser package functionality has to be somewhat extended as it currently does not report paths. According to the iText mailing list, an extension like that is on the ToDo list.

Mike Varosky

This question isn't directly related to what you want to accomplish, but it is indirectly related.

JCIS posted a great application that shows how arduous locating specific text is, albeit in VB. It wouldn't be as simple as running it through a VB to C# converter, but it should be translatable. This may seem like an easy task, but technically PDF is not a document format; it is a display format. The difference between the two is what makes this such a long process.

Hyangyong Lee

First, if the words are all English, you can parse and find them easily. If your documents are in another language, though, you need to understand exactly how the fonts for that language map to Unicode.
