Why is data added to the PDF content stream?

Posted by 自古美人都是妖i on 2020-01-15 10:43:09

Question


When using this code (Removing Watermark from PDF iTextSharp) to simply read and re-write the content stream for an identical PDF, I get additional operations added to the content stream for this file.

Before Content Stream

q
 q
/I0 Do
Q

Q
 q
10 0 0 10 0 0 cm
0.1 0 0 0.1 0 0 cm
/QuickPDFXO6d1c5c37 Do
Q

After Content Stream

q
0 -1 1 0 0 1224 cm
q
q
/I0 Do
Q
Q
q
10 0 0 10 0 0 cm
0.1 0 0 0.1 0 0 cm
/QuickPDFXO6d1c5c37 Do
Q
Q

Any idea why this was appended to my content stream?

q
0 -1 1 0 0 1224 cm
....
Q

My code is similar to the code in the linked article, except that I'm trying to remove certain items from the content stream.

XObjectRemover editor = new XObjectRemover();
List<List<PdfContentData>> output = editor.EditPageContent(stamper, pgNumber);
PdfContentByte content = stamper.GetUnderContent(pgNumber);

foreach (List<PdfContentData> bracketList in output)
{
    foreach (PdfContentData operandList in bracketList)
    {
        if (operandList.operandToDelete == false)
        {
            int index = 0;
            foreach (PdfObject op in operandList.pdfOperands)
            {
                op.ToPdf(content.PdfWriter, content.InternalBuffer);
                content.InternalBuffer.Append(operandList.pdfOperands.Count > ++index ? (byte)' ' : (byte)'\n');
            }
        }
    }
}

The PdfContentData class is just a collection of all the content operations with some flagged for delete.

public class PdfContentData
{
    public int opNumber { get; set; }
    public PdfLiteral pdfOperator { get; set; }
    public List<PdfObject> pdfOperands { get; set; }
    public bool operandToDelete { get; set; }

    public PdfContentData(int opNum, PdfLiteral op, List<PdfObject> ops)
    {
        this.opNumber = opNum;
        this.pdfOperator = op;
        this.pdfOperands = ops;
    }

    public override string ToString()
    {
        return $"Ops: [{string.Join(",", pdfOperands.Select(p => p.ToString()).ToArray())}]   Del: [{operandToDelete}]";
    }
}

and XObjectRemover is just a class that is derived from PdfContentStreamEditor, just like TransparentGraphicsRemover in @mkl's example.


Answer 1:


This addition

q
0 -1 1 0 0 1224 cm
....
Q

rotates everything in between. Adding this is a 'service' by iText(Sharp) intended to allow you to ignore the rotation and draw stuff using more natural coordinates.
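To see what this particular matrix does, here is a short worked calculation. It assumes the standard PDF convention that the operands `a b c d e f` of `cm` form a matrix applied to the row vector (x, y, 1); the numbers are the ones from the content stream above.

```latex
\begin{aligned}
\begin{pmatrix} x' & y' & 1 \end{pmatrix}
  &= \begin{pmatrix} x & y & 1 \end{pmatrix}
     \begin{pmatrix} a & b & 0 \\ c & d & 0 \\ e & f & 1 \end{pmatrix}
   = \begin{pmatrix} x & y & 1 \end{pmatrix}
     \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 1224 & 1 \end{pmatrix} \\
x' &= a x + c y + e = y \\
y' &= b x + d y + f = 1224 - x
\end{aligned}
```

So a point (x, y) ends up at (y, 1224 - x): a quarter turn clockwise followed by a shift of 1224 units (17 inches at 72 units per inch) along the y axis. For example, (0, 0) maps to (0, 1224) and (1224, 0) maps to (0, 0).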

Unfortunately, this service does not make sense for the task at hand. Thus, you should switch it off.

The PdfStamper has a flag allowing you to do just that:

/** Checks if the content is automatically adjusted to compensate
 * the original page rotation.
 * @return the auto-rotation status
 */    
/** Flags the content to be automatically adjusted to compensate
 * the original page rotation. The default is <CODE>true</CODE>.
 * @param rotateContents <CODE>true</CODE> to set auto-rotation, <CODE>false</CODE>
 * otherwise
 */    
virtual public bool RotateContents {
    set {
        stamper.RotateContents = value;
    }
    get {
        return stamper.RotateContents;
    }
} 

(The comments are Javadoc comments originally associated with a separate getter and setter for this attribute. Thus, this double comment.)

Thus, I would propose setting RotateContents to false.
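A minimal sketch of where the flag could go, assuming iTextSharp 5.x. The class name `RotationFreeStamping`, the method `Rewrite`, and the two path parameters are hypothetical and only serve the illustration; the editing step itself is left as a comment because `XObjectRemover` is the asker's own class. Setting the flag right after constructing the stamper, before any content is requested, is the safe choice.

```csharp
using System.IO;
using iTextSharp.text.pdf;

public static class RotationFreeStamping
{
    // Hypothetical helper: inputPath and outputPath are placeholders.
    public static void Rewrite(string inputPath, string outputPath)
    {
        PdfReader reader = new PdfReader(inputPath);
        try
        {
            using (FileStream output = new FileStream(outputPath, FileMode.Create))
            {
                PdfStamper stamper = new PdfStamper(reader, output);

                // Switch off the automatic "q ... cm ... Q" wrapper that
                // compensates for the page rotation.
                stamper.RotateContents = false;

                // Edit the pages here, for example with the XObjectRemover
                // loop from the question, using stamper.GetUnderContent(pgNumber).

                stamper.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

With the flag off, the content written back to the page should no longer be wrapped in the extra rotation block, so a plain read-and-rewrite stays close to the original stream.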



Source: https://stackoverflow.com/questions/39351092/why-is-data-added-to-the-pdf-content-stream
