Alter PDF - Text repositioning

Submitted by 自闭症网瘾萝莉.ら on 2019-12-01 06:26:02

I think it can be done if all the PDF files are simple and come from the same application.
If you need this for, say, a website where users can upload arbitrary files, you had better forget it: you will never get a solution that works perfectly with every PDF file.

PDFsharp can help - but AFAIK PDFsharp only does half of what you need. PDFsharp will give you the blocks that make up the PDF file. You have to parse the blocks to find the drawing instructions, check the positions, and relocate them.
Some applications don't even draw words, so a simple word such as "Hello" could be drawn in 3 chunks (maybe "He", "ll" and "o"). You may have to pay attention to this; maybe not if all files come from the same application.
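To make the "check the positions and relocate them" step concrete, here is a minimal sketch that works on a content stream already decoded to a string. It does not use PDFsharp at all; the class name ContentStreamShift and the method ShiftFirstTd are made up for this illustration. It assumes the text is positioned with the Td operator inside BT ... ET text objects, and it ignores Tm, cm, and the case where "BT", "ET" or "Td" happen to appear inside a string literal, so treat it as a starting point only.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of PDFsharp.
public static class ContentStreamShift
{
    // One text object: everything between the BT and ET operators.
    private static readonly Regex TextObject =
        new Regex(@"\bBT\b(?<body>.*?)\bET\b", RegexOptions.Singleline);

    // "<x> <y> Td": move the text position by (x, y).
    private static readonly Regex Td =
        new Regex(@"(?<![\w.\-])(?<x>-?\d*\.?\d+)\s+(?<y>-?\d*\.?\d+)\s+Td\b");

    public static string ShiftFirstTd(string content, double dx, double dy)
    {
        return TextObject.Replace(content, block =>
        {
            // Later Td operators are relative to the previous line start,
            // so only the first one in each text object is adjusted.
            string body = Td.Replace(block.Groups["body"].Value, m =>
                Format(Parse(m.Groups["x"].Value) + dx) + " " +
                Format(Parse(m.Groups["y"].Value) + dy) + " Td", 1);
            return "BT" + body + "ET";
        });
    }

    private static double Parse(string s)
    {
        return double.Parse(s, CultureInfo.InvariantCulture);
    }

    private static string Format(double v)
    {
        // PDF numbers must not use exponents or a decimal comma.
        return v.ToString("0.####", CultureInfo.InvariantCulture);
    }
}
```

For example, ShiftFirstTd("BT /F1 12 Tf 100 700 Td (He) Tj 0 -14 Td (llo) Tj ET", 10, -20) moves the start of the text object to 110 680 and leaves the relative "0 -14 Td" alone.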

I think the code shown here to extract text could be helpful:
http://forum.pdfsharp.net/viewtopic.php?p=4010#p4010
To relocate text you have to find it in the first place - a lot of additional work still needed ...
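As a rough idea of what "finding the text" involves, the sketch below collects the literal strings shown with the Tj operator from a decoded content stream and joins them, so that chunks like "He", "ll" and "o" come back as "Hello". This is not the code from the linked forum post; TextChunks, Extract and Join are names invented here. It assumes simple "(...) Tj" operators only and ignores TJ arrays, hex strings, nested unescaped parentheses, and font encodings, all of which real files may use.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of PDFsharp.
public static class TextChunks
{
    // "(...) Tj": show one literal string; backslash escapes are allowed inside.
    private static readonly Regex ShowText =
        new Regex(@"\((?<s>(?:\\.|[^\\()])*)\)\s*Tj\b", RegexOptions.Singleline);

    public static List<string> Extract(string content)
    {
        var chunks = new List<string>();
        foreach (Match m in ShowText.Matches(content))
        {
            // Undo the escapes \( \) and \\ ; other escapes are left untouched.
            chunks.Add(Regex.Replace(m.Groups["s"].Value, @"\\([()\\])", "$1"));
        }
        return chunks;
    }

    public static string Join(string content)
    {
        // Naive: assumes consecutive chunks belong to the same run of text.
        return string.Concat(Extract(content));
    }
}
```

Whether consecutive chunks really belong to one word depends on the positioning operators between them, so a real implementation would also have to look at those.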

Muthukumar J

You can remove an object using Page.Contents.Elements.RemoveAt(8). Validate the element count first by checking Page.Contents.Elements.Count.

To get the string value of each element (for example, to do some string validation), you can fetch the data as shown below.

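// Requires: using System.Text;
public static string GetElementStream(PdfPage page, int elementIndex)
{
    // Valid indexes run from 0 to Count - 1; return an empty string otherwise.
    if (elementIndex < 0 || elementIndex >= page.Contents.Elements.Count)
    {
        return "";
    }

    PdfDictionary.PdfStream stream = page.Contents.Elements.GetDictionary(elementIndex).Stream;

    // Value holds the stream bytes as stored in the file; they may still be compressed.
    byte[] streamValue = stream.Value;

    // Map each byte to one char, as the original loop did, without repeated string concatenation.
    StringBuilder builder = new StringBuilder(streamValue.Length);
    foreach (byte b in streamValue)
    {
        builder.Append((char)b);
    }
    return builder.ToString();
}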

Or you could draw over the original text and create a read-only text form field at the new location.

If a commercial library is an option instead of PDFsharp, you could try Amyuni PDF Creator .Net or Amyuni PDF Creator ActiveX. The method IacDocument.GetObjectsInRectangle lets you retrieve all the "graphic objects" in a specified rectangle; you could then add a certain value to each x and/or y coordinate to move those objects around the page. From the documentation:

IacDocument.GetObjectsInRectangle Method

The GetObjectsInRectangle method gets all the objects that are in the specified rectangle.

Usual disclaimer applies.
