Tc, Tw and Tz operators with PDFBox

Submitted by 喜欢而已 on 2019-12-12 18:25:49

Question


I am trying to read an existing PDF document with PDFBox, find the Tj operator, and then set the word spacing (Tw), the character spacing (Tc) and the horizontal scaling (Tz) in order to generate a modified document. My problem is that when I open the modified document in an editor to inspect its file structure, the values of the Tc, Tw and Tz operators differ from the ones I set. How can I prevent this change?
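
For context, here is a sketch of how the three parameters enter glyph positioning. It follows the horizontal text-space displacement formula from the PDF specification (ISO 32000-1, section 9.4.4). The symbols are the specification's, not names from the code in this question:

```latex
t_x = \left( \left( w_0 - \frac{T_j}{1000} \right) T_{fs} + T_c + T_w \right) \times T_h ,
\qquad T_h = \frac{\text{operand of } Tz}{100}
```

Here $w_0$ is the glyph's horizontal displacement, $T_j$ is a position adjustment from a TJ array (zero for a plain Tj), $T_{fs}$ is the font size, $T_c$ is the character spacing and $T_w$ is the word spacing, which is applied only to the single-byte character code 32. Tc and Tw are given in unscaled text space units, while the Tz operand is a percentage.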

Consider this code:

// Imports for the PDFBox 1.8.x API used below
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;

import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSFloat;
import org.apache.pdfbox.cos.COSString;
import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdfparser.PDFStreamParser;
import org.apache.pdfbox.pdfwriter.ContentStreamWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.util.PDFOperator;

public static void main(String[] args) throws IOException, COSVisitorException {
    // Create the source document; tes, Test1.CreatePdf and src are defined elsewhere (not shown)
    tes = new Test1();
    tes.CreatePdf(src);

    PDDocument doc = PDDocument.load("doc.pdf");
    List pages = doc.getDocumentCatalog().getAllPages();
    for (int i = 0; i < pages.size(); i++) {
        PDPage page = (PDPage) pages.get(i);
        PDStream contents = page.getContents();
        COSDictionary dic = page.getCOSDictionary();
        System.out.println(dic.getCOSObject());

        // Parse the page content stream into a flat list of operands and operators
        PDFStreamParser parser = new PDFStreamParser(contents.getStream());
        parser.parse();
        List tokens = parser.getTokens();
        System.out.println(tokens);

        for (int j = 0; j < tokens.size(); j++) {
            Object next = tokens.get(j);
            if (next instanceof PDFOperator) {
                PDFOperator op = (PDFOperator) next;
                // Tj and TJ are the two operators that display strings in a PDF;
                // only Tj is handled here
                if (op.getOperation().equals("Tj")) {
                    // Replace "(string) Tj" with three Tj calls, each preceded
                    // by the spacing operators to apply
                    tokens.set(j - 1, COSFloat.get("0.00416145"));
                    tokens.set(j, PDFOperator.getOperator("Tc"));
                    tokens.add(++j, new COSString("he"));
                    tokens.add(++j, PDFOperator.getOperator("Tj"));
                    tokens.add(++j, COSFloat.get("0.001611215"));
                    tokens.add(++j, PDFOperator.getOperator("Tc"));
                    tokens.add(++j, COSFloat.get("0.0067152"));
                    tokens.add(++j, PDFOperator.getOperator("Tw"));
                    tokens.add(++j, new COSString("llo w"));
                    tokens.add(++j, PDFOperator.getOperator("Tj"));
                    tokens.add(++j, COSFloat.get("100.001410144"));
                    tokens.add(++j, PDFOperator.getOperator("Tz"));
                    tokens.add(++j, new COSString("orld"));
                    tokens.add(++j, PDFOperator.getOperator("Tj"));
                }
            }
        }

        // Now that the tokens are updated, replace the page content stream
        PDStream updatedStream = new PDStream(doc);
        OutputStream out = updatedStream.createOutputStream();
        ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
        tokenWriter.writeTokens(tokens);
        page.setContents(updatedStream);
    }

    doc.save("a.pdf");
    doc.close();
}

The resulting file structure is as follows:
6 0 obj
<<
/Length 8 0 R
>>
stream
BT
 /F0 12 Tf
 15 385 Td
 0.0041614501 Tc
 (he) Tj
 0.0016112149 Tc
 0.0067151999 Tw
 (llo w) Tj
 100.001411438 Tz
 (orld) Tj
 ET

endstream
endobj
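
One plausible reading of the differences above is ordinary single-precision rounding. The sketch below is a guess at the mechanism, not PDFBox's actual code. It assumes the operands are held as 32-bit Java floats and written back with at most ten decimal places; both assumptions are inferred from the stream shown here. The class and method names (`FloatRoundTrip`, `roundTrip`) are made up for illustration, and only the Java standard library is used.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class FloatRoundTrip {

    // Parse a decimal string as a 32-bit float, then render that float's exact
    // value rounded to at most 10 decimal places. The 10-digit limit is an
    // assumption chosen to match the stream in the question.
    static String roundTrip(String decimal) {
        float stored = Float.parseFloat(decimal);
        return new BigDecimal((double) stored)   // widening a float to double is exact
                .setScale(10, RoundingMode.HALF_UP)
                .stripTrailingZeros()
                .toPlainString();
    }

    public static void main(String[] args) {
        // The four operand values passed to COSFloat.get(...) in the question
        String[] inputs = {"0.00416145", "0.001611215", "0.0067152", "100.001410144"};
        for (String in : inputs) {
            System.out.println(in + " -> " + roundTrip(in));
        }
    }
}
```

For these four inputs the sketch yields 0.0041614501, 0.0016112149, 0.0067151999 and 100.001411438, the same values that appear in the stream. If the assumption holds, the written numbers are simply the nearest floats to the requested decimals, and the relative differences are on the order of 1e-8.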

Best regards,

Source: https://stackoverflow.com/questions/18032723/tc-tw-and-tz-operators-with-pdfbox
