Getting Text Colour with PDFBox

倾然丶 夕夏残阳落幕 提交于 2019-11-29 12:56:13

All color informations should be stored in the class PDGraphicsState and the used color (stroking/nonstroking etc.) depends on the used text rendering mode (via pdfbox mailing list).

Here is a small sample I tried:

After creating a pdf with just one line ("Sample" written in RGB=[146,208,80]), the following program will output:

DeviceRGB 146.115 208.08 80.07

Here's the code:

PDDocument doc = null;
try {
    doc = PDDocument.load("C:/Path/To/Pdf/Sample.pdf");
    PDFStreamEngine engine = new PDFStreamEngine(ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties"));
    PDPage page = (PDPage)doc.getDocumentCatalog().getAllPages().get(0);
    engine.processStream(page, page.findResources(), page.getContents().getStream());
    PDGraphicsState graphicState = engine.getGraphicsState();
    System.out.println(graphicState.getStrokingColor().getColorSpace().getName());
    float colorSpaceValues[] = graphicState.getStrokingColor().getColorSpaceValue();
    for (float c : colorSpaceValues) {
        System.out.println(c * 255);
    }
}
finally {
    if (doc != null) {
        doc.close();
    }

Take a look at PageDrawer.properties to see how PDF operators are mapped to Java classes.

As I understand it, as PDFStreamEngine processes a page stream, it sets various variable states depending on what operators it is processing at the moment. So when it hits green text, it will change the PDGraphicsState because it will encounter appropriate operators. So for CS it calls org.apache.pdfbox.util.operator.SetStrokingColorSpace as defined by mapping CS=org.apache.pdfbox.util.operator.SetStrokingColorSpace in the .properties file. RG is mapped to org.apache.pdfbox.util.operator.SetStrokingRGBColor and so on.

In this case, the PDGraphicsState hasn't changed because the document has just text and the text it has is in just one style. For something more advanced, you would need to extend PDFStreamEngine (just like PageDrawer, PDFTextStripper and other classes do) to do something when color changes. You could also write your own mappings in your own .properties file.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!