pdfbox: how to clone a page

Submitted by 拥有回忆 on 2019-12-05 11:26:41

The least resource-intensive way to clone a page is a shallow copy of the corresponding page dictionary:

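// Note: getAllPages() and getCOSDictionary() are PDFBox 1.x API.
import java.util.List;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

PDDocument doc = PDDocument.load( file );

List<PDPage> allPages = doc.getDocumentCatalog().getAllPages();

// Shallow copy: the new dictionary shares all values with the original page
PDPage page = allPages.get(0);
COSDictionary pageDict = page.getCOSDictionary();
COSDictionary newPageDict = new COSDictionary(pageDict);

// Drop the annotations from the copy (see below)
newPageDict.removeItem(COSName.ANNOTS);

PDPage newPage = new PDPage(newPageDict);
doc.addPage(newPage);

doc.save( outfile );
doc.close();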

I explicitly removed the annotations (form fields etc.) from the copy because an annotation can carry a reference back to its page (the P entry), and in the copied page that reference would still point to the original page.

Thus, if you want the annotations to come along in a clean way, you have to create shallow copies of the annotations array and all contained annotation dictionaries, too, and replace the page reference therein.

Most PDF readers do not mind, though, if the page references are incorrect. For a dirty solution, therefore, you could simply leave the annotations in the page dictionary. But who wants to be dirty... ;)
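The clean variant (copy the annotations array, copy each annotation dictionary, repoint the page reference) can be sketched roughly as follows. This is only a model: plain java.util maps and lists stand in for COSDictionary and COSArray, the class and method names are made up for illustration, and nothing here is PDFBox API. With PDFBox the same steps would be carried out on the COS objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Model only: Map = PDF dictionary, List = PDF array (hypothetical stand-ins).
class AnnotationCopySketch {

    // Shallow-copies the page, then shallow-copies the "Annots" array and each
    // annotation dictionary in it, pointing each copy's "P" entry at the new page.
    static Map<String, Object> clonePageWithAnnotations(Map<String, Object> page) {
        Map<String, Object> newPage = new HashMap<>(page);
        Object annots = page.get("Annots");
        if (annots instanceof List) {
            List<Object> newAnnots = new ArrayList<>();
            for (Object item : (List<?>) annots) {
                if (item instanceof Map) {
                    @SuppressWarnings("unchecked")
                    Map<String, Object> copy = new HashMap<>((Map<String, Object>) item);
                    copy.put("P", newPage); // replace the back-reference
                    newAnnots.add(copy);
                }
            }
            newPage.put("Annots", newAnnots);
        }
        return newPage;
    }

    public static void main(String[] args) {
        Map<String, Object> page = new HashMap<>();
        Map<String, Object> annot = new HashMap<>();
        annot.put("Subtype", "Widget");
        annot.put("P", page);
        List<Object> annots = new ArrayList<>();
        annots.add(annot);
        page.put("Annots", annots);

        Map<String, Object> copy = clonePageWithAnnotations(page);
        Map<?, ?> copiedAnnot = (Map<?, ?>) ((List<?>) copy.get("Annots")).get(0);
        // Identity checks only: the structure is cyclic, so avoid equals()/toString().
        System.out.println(copiedAnnot.get("P") == copy); // copy points at the new page
        System.out.println(annot.get("P") == page);       // original is untouched
    }
}
```

Note that this only fixes the page reference. If the annotations are form field widgets, the copies still belong to the same fields as the originals.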

If you want to additionally change some parts of the new or the old page, you obviously also have to copy the respective PDF objects before manipulating them.
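Why the extra copy is needed can be seen in a small model. Again, this is a sketch with plain java.util maps standing in for PDF dictionaries; the class name, the detach helper and the "Resources"/"Font" values are illustrative assumptions, not PDFBox API.

```java
import java.util.HashMap;
import java.util.Map;

// Model only: Map = PDF dictionary (hypothetical stand-in).
class CopyBeforeModifySketch {

    // Replaces the nested dictionary stored under key with a shallow copy of
    // itself and returns that copy, so it can be changed without affecting
    // other dictionaries that share the original nested object.
    static Map<String, Object> detach(Map<String, Object> dict, String key) {
        Object nested = dict.get(key);
        Map<String, Object> own = new HashMap<>();
        if (nested instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) nested).entrySet()) {
                own.put(String.valueOf(e.getKey()), e.getValue());
            }
        }
        dict.put(key, own);
        return own;
    }

    public static void main(String[] args) {
        Map<String, Object> resources = new HashMap<>();
        resources.put("Font", "F1");
        Map<String, Object> page = new HashMap<>();
        page.put("Resources", resources);

        // Shallow copy: both pages share the very same Resources object.
        Map<String, Object> newPage = new HashMap<>(page);
        System.out.println(newPage.get("Resources") == page.get("Resources"));

        // Copy the nested dictionary first, then modify only the copy.
        detach(newPage, "Resources").put("Font", "F2");
        System.out.println(resources.get("Font"));
    }
}
```

Without the detach step, the put would have changed the resources of the original page as well, because a shallow copy duplicates only the top-level dictionary.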

Some other remarks:

Your original page cloning to me looks weird. After all you add the identical page dictionary to the document again (duplicate entries in the page tree are ignored, I think) and then do some merge between these identical page objects.

I assume the PDFCloneUtility is meant for cloning between different documents, not inside the same one. Either way, merging a dictionary into itself is not something that has to work.

"I would like to get a reference to all the PDFields for any form fields in this newly cloned page"

As the fields have the same name, they are identical!

Fields in PDF are abstract fields which can have many appearances spread over the document. The same name implies the same field.

A field appearing on some page means that there is an annotation representing that field on the page. To make things more complicated, field dictionary and annotation dictionary can be merged for fields with one appearance only.

Thus, depending on your requirements you will first have to decide whether you want to work with fields or with field annotations.
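The relationship between a field and its annotations described above can be modelled as follows. This is a sketch under simplifying assumptions: plain java.util maps stand in for PDF dictionaries, the field is assumed to be a terminal field whose Kids (if any) are all widget annotations, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Model only: Map = PDF dictionary, List = PDF array (hypothetical stand-ins).
class FieldWidgetSketch {

    // Returns the widget annotations through which a field appears on pages:
    // the entries of "Kids" if present, otherwise the field dictionary itself
    // when field and widget are merged into a single dictionary.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> widgetsOf(Map<String, Object> field) {
        List<Map<String, Object>> result = new ArrayList<>();
        Object kids = field.get("Kids");
        if (kids instanceof List) {
            for (Object kid : (List<?>) kids) {
                if (kid instanceof Map) {
                    result.add((Map<String, Object>) kid);
                }
            }
        } else if ("Widget".equals(field.get("Subtype"))) {
            result.add(field);
        }
        return result;
    }

    public static void main(String[] args) {
        // One field with two appearances: separate widget dictionaries.
        Map<String, Object> field = new HashMap<>();
        field.put("T", "name");
        List<Object> kids = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            Map<String, Object> widget = new HashMap<>();
            widget.put("Subtype", "Widget");
            widget.put("Parent", field);
            kids.add(widget);
        }
        field.put("Kids", kids);
        System.out.println(widgetsOf(field).size());

        // One field with a single appearance: field and widget merged.
        Map<String, Object> merged = new HashMap<>();
        merged.put("T", "city");
        merged.put("Subtype", "Widget");
        System.out.println(widgetsOf(merged).get(0) == merged);
    }
}
```

In this model, cloning a page that shows the field "name" and keeping its widgets adds further appearances of that same field; it does not create a new field.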
