Append mode requires a document without errors, even if recovery is possible

只谈情不闲聊 submitted on 2020-01-13 07:07:35

Question


The PDF I am trying to sign in append mode was exported from Office Word 2016.

Here is my file: word.pdf

And I got this error message:

com.itextpdf.kernel.PdfException: Append mode requires a document without errors, even if recovery is possible.

I am using iText 7, version 7.0.4.


Answer 1:


The document you are trying to change in append mode is broken. Most likely, the byte offsets as defined in the cross-reference table don't correspond with the actual byte positions of the PDF objects.
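The kind of mismatch described above can be checked by hand, but a minimal sketch may make it more concrete. It assumes a classic (uncompressed) cross-reference table, and it assumes you already know the byte offset of its `xref` keyword. The class and method names (`XrefOffsetCheck`, `mismatches`) are made up for illustration and are not part of iText.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an iText API: checks a classic xref table against the file bytes.
final class XrefOffsetCheck {
    // One table entry: 10-digit byte offset, 5-digit generation, 'n' (in use) or 'f' (free).
    private static final Pattern ENTRY = Pattern.compile("(\\d{10}) (\\d{5}) ([nf])");
    // Subsection header: first object number and entry count, e.g. "0 26".
    private static final Pattern SUBSECTION = Pattern.compile("(\\d+) (\\d+)");

    /** Returns the object numbers whose in-use entry does not point at "<num> <gen> obj". */
    static List<Integer> mismatches(byte[] pdf, int xrefOffset) {
        // ISO-8859-1 maps every byte to exactly one char, so string indexes equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        List<Integer> bad = new ArrayList<>();
        int objNum = 0;
        for (String raw : text.substring(xrefOffset).split("\\r?\\n|\\r")) {
            String line = raw.trim();
            if (line.equals("trailer")) {
                break; // end of the table
            }
            Matcher entry = ENTRY.matcher(line);
            Matcher header = SUBSECTION.matcher(line);
            if (entry.matches()) {
                if (entry.group(3).equals("n")) {
                    long offset = Long.parseLong(entry.group(1));
                    String expected = objNum + " " + Integer.parseInt(entry.group(2)) + " obj";
                    if (offset >= text.length() || !text.startsWith(expected, (int) offset)) {
                        bad.add(objNum);
                    }
                }
                objNum++; // free entries also consume an object number
            } else if (header.matches()) {
                objNum = Integer.parseInt(header.group(1));
            }
        }
        return bad;
    }
}
```

An empty result only says that this one table is consistent with the bytes; it says nothing about cross-reference streams or about how several cross-reference sections are chained together.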

In your case, I see something strange at the end of the file:

xref
0 26
0000000010 65535 f
0000000017 00000 n
0000000166 00000 n
0000000222 00000 n
0000000492 00000 n
0000000755 00000 n
0000000932 00000 n
0000001180 00000 n
0000001233 00000 n
0000001286 00000 n
0000000011 65535 f
0000000012 65535 f
0000000013 65535 f
0000000014 65535 f
0000000015 65535 f
0000000016 65535 f
0000000017 65535 f
0000000018 65535 f
0000000019 65535 f
0000000020 65535 f
0000000000 65535 f
0000001961 00000 n
0000002154 00000 n
0000044863 00000 n
0000048000 00000 n
0000048045 00000 n
trailer
<</Size 26/Root 1 0 R/Info 9 0 R/ID[<8812105F6F93284DAEF240C8C1FC4C4E><8812105F6F93284DAEF240C8C1FC4C4E>] >>
startxref
48341
%%EOF
xref
0 0
trailer
<</Size 26/Root 1 0 R/Info 9 0 R/ID[<8812105F6F93284DAEF240C8C1FC4C4E><8812105F6F93284DAEF240C8C1FC4C4E>] /Prev 48341/XRefStm 48045>>
startxref
49017
%%EOF

You have a PDF with two trailers. One trailer claims that the cross-reference table is stored in a stream:

/XRefStm 48045

while at the same time indicating that the cross-reference table starts at byte position 49017:

startxref
49017

The other trailer claims that there's an uncompressed cross-reference table and that it starts at byte position 48341:

startxref
48341

And indeed, there is an uncompressed cross-reference table at that position:

xref
0 26
0000000010 65535 f
0000000017 00000 n

Do you understand the inconsistency in your file?
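If you want to see how many trailers a file ends with without opening it in a hex editor, the following is a rough sketch that lists every startxref value followed by an %%EOF marker. The class name `StartXrefScanner` is hypothetical, and the regular expression is a simplification that assumes the keywords do not also occur inside stream data.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an iText API.
final class StartXrefScanner {
    // "startxref", the byte offset, then the end-of-file marker of that revision.
    private static final Pattern STARTXREF = Pattern.compile("startxref\\s+(\\d+)\\s+%%EOF");

    /** Returns the startxref offsets in file order, one per %%EOF-terminated revision. */
    static List<Long> startXrefOffsets(byte[] pdf) {
        // ISO-8859-1 keeps a one-to-one mapping between bytes and chars.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        List<Long> offsets = new ArrayList<>();
        Matcher m = STARTXREF.matcher(text);
        while (m.find()) {
            offsets.add(Long.parseLong(m.group(1)));
        }
        return offsets;
    }
}
```

For the file in the question this should report two values, 48341 and 49017, matching the two trailers quoted above.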

When you use append mode, iText doesn't change anything in the original document: not a single byte is changed; new bytes are added after the final %%EOF marker of the original file. However, iText refuses to do this when the original file is broken. I hope you understand the rationale: you'd make a bad situation worse if iText allowed you to do this.

To solve this problem, you need to fix the broken file first. That can be done by "manipulating" the document without changing anything, but you have to do this in normal mode, not in append mode.

Have you tried removing the extra trailer? I threw away:

xref
0 0
trailer
<</Size 26/Root 1 0 R/Info 9 0 R/ID[<8812105F6F93284DAEF240C8C1FC4C4E><8812105F6F93284DAEF240C8C1FC4C4E>] /Prev 48341/XRefStm 48045>>
startxref
49017
%%EOF

Adobe Reader didn't complain after removing these bytes.




Answer 2:


This is actually a bug in the way iText 7 creates an incremental update to a hybrid-reference file.

The error situation

Unfortunately, the question did not clearly describe a way to reproduce the error. It can be reproduced like this:

  1. Stamping the OP's sample document in append mode (it does not need to be a signing use case).

    This step does not yet create the error in question.

  2. Stamping the output of step 1 again in append mode (again it does not need to be for signing).

    In this step the exception

    com.itextpdf.kernel.PdfException: Append mode requires a document without errors, even if recovery is possible.
    

    occurs.

The PDF in question

The OP's PDF is special as it is a hybrid-reference file. According to the PDF specification (ISO 32000-1) such a file

is readable by readers designed only to support versions of PDF before PDF 1.5. Such a file contains objects referenced by standard cross-reference tables in addition to objects in object streams that are referenced by cross-reference streams.

In case of these files the startxref offset points to the start of the pre-1.5 cross reference table and the trailer XRefStm entry points to the 1.5 cross reference stream.
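To make the roles of these entries concrete, here is a small sketch that pulls integer entries such as Prev and XRefStm out of a trailer dictionary given as text. The class `TrailerKeys` and its methods are hypothetical, and the regular expression is a shortcut that assumes a flat dictionary like the ones in this file; it is not a real PDF parser.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an iText API.
final class TrailerKeys {
    /** Reads "/<key> <integer>" from the dictionary text, if present. */
    static OptionalLong intEntry(String trailerDict, String key) {
        Matcher m = Pattern.compile("/" + Pattern.quote(key) + "\\s+(\\d+)").matcher(trailerDict);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    /** A trailer carrying XRefStm marks the file as a hybrid-reference file. */
    static boolean hasXRefStm(String trailerDict) {
        return intEntry(trailerDict, "XRefStm").isPresent();
    }
}
```

Applied to the last trailer of the OP's file, `intEntry(trailer, "XRefStm")` should yield 48045 (the cross reference stream) and `intEntry(trailer, "Prev")` should yield 48341 (the earlier cross reference table).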

The PDF specification furthermore dictates that

the XRefStm entry shall not be used in the trailer dictionary of the main cross-reference section but only in an update cross-reference section.

Hence the funny-looking construction in the file:

18 0 obj
<</Type/ObjStm/N 10/First 67/Filter/FlateDecode/Length 357>>
stream
[...object stream data...]
endstream
endobj
[...]
25 0 obj
<</Type/XRef/Size 25/W[ 1 4 2] /Root 1 0 R/Info 9 0 R/ID[<8812105F6F93284DAEF240C8C1FC4C4E><8812105F6F93284DAEF240C8C1FC4C4E>] /Filter/FlateDecode/Length 97>>
stream
[...cross reference stream data...]
endstream
endobj
xref
0 26
[...cross reference table with 26 entries, objects in object stream are marked free...]
trailer
<</Size 26/Root 1 0 R/Info 9 0 R/ID[<8812105F6F93284DAEF240C8C1FC4C4E><8812105F6F93284DAEF240C8C1FC4C4E>] >>
startxref
[points to the preceding cross reference table]
48341
%%EOF
xref
0 0
[...empty incremental update cross reference table...]
trailer
[XRefStm points to the cross reference stream in object 25]
<</Size 26/Root 1 0 R/Info 9 0 R/ID[<8812105F6F93284DAEF240C8C1FC4C4E><8812105F6F93284DAEF240C8C1FC4C4E>] /Prev 48341/XRefStm 48045>>
startxref
[points to the empty incremental update cross reference table]
49017
%%EOF

Thus, while funny looking, this construction is correct.

What goes wrong

When reading the original document, iText 7 recognizes that the document contains both a cross reference table and a cross reference stream and chooses the cross reference stream. (Actually PdfReader.readXrefSection first reads the empty cross reference table, then finds the XRefStm entry in the trailer, and then reads the cross reference stream.)

When creating the incremental update, iText 7 remembers that the source PDF has been parsed via a cross reference stream and, therefore, uses full compression, i.e. in particular it uses an object stream and a cross reference stream.

When creating that cross reference stream, though, it sets its Prev entry to what the final startxref of the original PDF pointed to, i.e. the empty cross reference table, and not the cross reference stream it actually used.

Such mixed constructs (a cross reference stream pointing to a cross reference table as Prev) are not allowed, though.

Thus, in step one iText created an invalid cross reference structure in its result document and, therefore, in step two found a broken PDF to process and complained.
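The mixed construct described above can be spotted with a check along the following lines. This is only a sketch with hypothetical names (`XrefSectionKind`, `kindAt`, `streamPointsToTable`); it is not how iText detects the problem. It also assumes that any indirect object found at a cross reference offset is a cross reference stream, without inspecting its Type entry.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper, not an iText API.
final class XrefSectionKind {
    enum Kind { TABLE, STREAM, UNKNOWN }

    // Header of an indirect object, e.g. "25 0 obj".
    private static final Pattern OBJ_HEADER = Pattern.compile("\\d+\\s+\\d+\\s+obj\\b");

    /** Classifies what a startxref or Prev offset points at. */
    static Kind kindAt(byte[] pdf, int offset) {
        if (offset < 0 || offset >= pdf.length) {
            return Kind.UNKNOWN;
        }
        int len = Math.min(64, pdf.length - offset);
        String head = new String(pdf, offset, len, StandardCharsets.ISO_8859_1);
        if (head.startsWith("xref")) {
            return Kind.TABLE;   // classic cross reference table
        }
        if (OBJ_HEADER.matcher(head).lookingAt()) {
            return Kind.STREAM;  // assumed to be a cross reference stream object
        }
        return Kind.UNKNOWN;
    }

    /** True for the construct described above: a stream whose Prev is a table. */
    static boolean streamPointsToTable(Kind section, Kind prev) {
        return section == Kind.STREAM && prev == Kind.TABLE;
    }
}
```

For the output of step 1, the final startxref would classify as STREAM while the Prev value 49017 classifies as TABLE, which is the combination that makes iText complain in step 2.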



Source: https://stackoverflow.com/questions/47281240/append-mode-requires-a-document-without-errors-even-if-recovery-is-possible
