Get Layer2 Text (Signature Description) from signature image using itextsharp

问题

I need to retrieve the layer2 text from a signature. How can I get the description (under the signature image) using itextsharp? below is the code I'm using to get the sign date and username:

        PdfReader reader = new PdfReader(pdfPath, System.Text.Encoding.UTF8.GetBytes(MASTER_PDF_PASSWORD));
        using (MemoryStream memoryStream = new MemoryStream())
        {
            PdfStamper stamper = new PdfStamper(reader, memoryStream);
            AcroFields acroFields = stamper.AcroFields;
            List<String> names = acroFields.GetSignatureNames();
            foreach (String name in names)
            {
                PdfPKCS7 pk = acroFields.VerifySignature(name);
                String userName = PdfPKCS7.GetSubjectFields(pk.SigningCertificate).GetField("CN");
                Console.WriteLine("Sign Date: " + pk.SignDate.ToString() + " Name: " + userName);
               // Here i need to retrieve the description underneath the signature image
            }
            reader.RemoveUnusedObjects();
            reader.Close();
            stamper.Writer.CloseStream = false;
            if (stamper != null)
            {
                stamper.Close();
            }
        }

and below is the code I used to set the description

PdfStamper st = PdfStamper.CreateSignature(reader, memoryStream, '\0', null, true);
PdfSignatureAppearance sap = st.SignatureAppearance;
sap.Render = PdfSignatureAppearance.SignatureRender.GraphicAndDescription;
sap.Layer2Font = font;
sap.Layer2Text = "Some text that i want to retrieve";

Thank you.

回答1:

While Bruno addressed the issue starting with a PDF containing a "layer 2", allow me to first state that using these "signature layers" in PDF signature appearances is not required by the PDF specification, the specification actually does not even know these layers at all! Thus, if you try to parse a specific layer, you may not find such a "layer" or, even worse, find something that looks like that layer (a XObject named n2) which contains the wrong data.

That been said, though, Whether you look for text from a layer 2 or from the signature appearance as a whole, you can use iTextSharp text extraction capabilities. I used Bruno's code as base for retrieving the n2 layer.

public static void ExtractSignatureTextFromFile(FileInfo file)
{
    try
    {
        Console.Out.Write("File: {0}\n", file);
        using (var pdfReader = new PdfReader(file.FullName))
        {
            AcroFields fields = pdfReader.AcroFields;
            foreach (string name in fields.GetSignatureNames())
            {
                Console.Out.Write("  Signature: {0}\n", name);
                iTextSharp.text.pdf.AcroFields.Item item = fields.GetFieldItem(name);
                PdfDictionary widget = item.GetWidget(0);
                PdfDictionary ap = widget.GetAsDict(PdfName.AP);
                if (ap == null)
                    continue;
                PdfStream normal = ap.GetAsStream(PdfName.N);
                if (normal == null)
                    continue;
                Console.Out.Write("    Content of normal appearance: {0}\n", extractText(normal));

                PdfDictionary resources = normal.GetAsDict(PdfName.RESOURCES);
                if (resources == null)
                    continue;
                PdfDictionary xobject = resources.GetAsDict(PdfName.XOBJECT);
                if (xobject == null)
                    continue;
                PdfStream frm = xobject.GetAsStream(PdfName.FRM);
                if (frm == null)
                    continue;
                PdfDictionary res = frm.GetAsDict(PdfName.RESOURCES);
                if (res == null)
                    continue;
                PdfDictionary xobj = res.GetAsDict(PdfName.XOBJECT);
                if (xobj == null)
                    continue;
                PRStream n2 = (PRStream) xobj.GetAsStream(PdfName.N2);
                if (n2 == null)
                    continue;
                Console.Out.Write("    Content of normal appearance, layer 2: {0}\n", extractText(n2));
            }
        }
    }
    catch (Exception ex)
    {
        Console.Error.Write("Error... " + ex.StackTrace);
    }
}

public static String extractText(PdfStream xObject)
{
    PdfDictionary resources = xObject.GetAsDict(PdfName.RESOURCES);
    ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();

    PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
    processor.ProcessContent(ContentByteUtils.GetContentBytesFromContentObject(xObject), resources);
    return strategy.GetResultantText();
}

For the sample file signature_n2.pdf Bruno used you get this:

File: ...\signature_n2.pdf
  Signature: Signature1
    Content of normal appearance: This document was signed by Bruno
Specimen.
    Content of normal appearance, layer 2: This document was signed by Bruno
Specimen.

As this sample uses the layer 2 as the OP expects, it already contains the text in question.

回答2:

Please take a look at the following PDF: signature_n2.pdf. It contains a signature with the following text in the n2 layer:

This document was signed by Bruno
Specimen.

Before we can write code to extract this text, we should use iText RUPS to look at the internal structure of the PDF, so that we can find out where this /n2 layer is stored:

Based on this information, we can start writing our code. See the GetN2fromSig example:

public static void main(String[] args) throws IOException {
    PdfReader reader = new PdfReader(SRC);
    AcroFields fields = reader.getAcroFields();
    Item item = fields.getFieldItem("Signature1");
    PdfDictionary widget = item.getWidget(0);
    PdfDictionary ap = widget.getAsDict(PdfName.AP);
    PdfStream normal = ap.getAsStream(PdfName.N);
    PdfDictionary resources = normal.getAsDict(PdfName.RESOURCES);
    PdfDictionary xobject = resources.getAsDict(PdfName.XOBJECT);
    PdfStream frm = xobject.getAsStream(PdfName.FRM);
    PdfDictionary res = frm.getAsDict(PdfName.RESOURCES);
    PdfDictionary xobj = res.getAsDict(PdfName.XOBJECT);
    PRStream n2 = (PRStream) xobj.getAsStream(PdfName.N2);
    byte[] stream = PdfReader.getStreamBytes(n2);
    System.out.println(new String(stream));
}

We get the widget annotation for the signature field with name "signature1". Based on the info from RUPS, we know that we have to get the resources (/Resources) of the normal (/N) appearance (/AP). In the /XObjects dictionary, we'll find a form XObject named /FRM. This XObject has in turn also some /Resources, more specifically two /XObjects, one named /n0, the other one named /n2.

We get the stream of the /n2 object and we convert it to an uncompressed byte[]. When we print this array as a String, we get the following result:

BT
1 0 0 1 0 49.55 Tm
/F1 12 Tf
(This document was signed by Bruno)Tj
1 0 0 1 0 31.55 Tm
(Specimen.)Tj
ET

This is PDF syntax. BT and ET stand for "Begin Text" and "End Text". The Tm operator set the text matrix. The Tf operator set the font. Tj shows the strings that are delimited by ( and ). If you want the plain text, it's sufficient to extract only the text that is between parentheses.

来源：https://stackoverflow.com/questions/30305126/get-layer2-text-signature-description-from-signature-image-using-itextsharp

标签

itextsharp