DPI of image extracted from PDF with pdfBox

China☆狼群 submitted on 2019-11-28 09:14:24

Question


I'm using the Java PDFBox library to validate single-page PDF files with embedded images.

I know that a PDF file itself doesn't contain DPI information.

However, images that have equal dimensions in the document have different sizes in pixels after extraction, and no DPI metadata.

So is it possible to calculate the image sizes relative to the PDF page, or to extract the images together with their DPI information (for PNG or JPEG image files), using PDFBox?

Thanks!


Answer 1:


Get the PrintImageLocations.java file from the PDFBox source download. Here's an excerpt of the source; only the last line was added by me, and it outputs the DPI:

            float imageXScale = ctmNew.getXScale();
            float imageYScale = ctmNew.getYScale();
            System.out.println("position = " + ctmNew.getXPosition() + ", " + ctmNew.getYPosition());
            // size in pixel
            System.out.println("size = " + imageWidth + "px, " + imageHeight + "px");
            // size in page units
            System.out.println("size = " + imageXScale + "pu, " + imageYScale + "pu");
            // size in inches 
            imageXScale /= 72;
            imageYScale /= 72;
            System.out.println("size = " + imageXScale + "in, " + imageYScale + "in");
            // size in millimeter
            imageXScale *= 25.4;
            imageYScale *= 25.4;
            System.out.println("size = " + imageXScale + "mm, " + imageYScale + "mm");

            // effective dpi = pixels / inches, where inches = page units / 72
            System.out.printf("dpi  = %.0f dpi (X), %.0f dpi (Y) %n", image.getWidth() * 72 / ctmNew.getXScale(), image.getHeight() * 72 / ctmNew.getYScale());

And here's a sample output:

    Found image [X0]
    position = 0.0, 0.0
    size = 2544px, 3523px <---- pixels
    size = 610.56pu, 845.52pu <---- "page units", 1pu = 1/72 inch
    size = 8.48in, 11.743334in
    size = 215.39198mm, 298.28067mm
    dpi = 300 dpi (X), 300 dpi (Y)
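
The arithmetic behind that last line can be checked without PDFBox at all. The following is only a minimal sketch: the class and method names (`ImageDpi`, `effectiveDpi`) are made up for illustration, and it assumes the default PDF user space of 72 units per inch. It plugs in the pixel and page-unit sizes from the sample output above.

```java
// Minimal sketch: effective DPI from pixel size and rendered size in page units.
// ImageDpi and effectiveDpi are illustrative names, not part of PDFBox.
class ImageDpi {
    // Assumption: default PDF user space, 72 units per inch.
    static final double UNITS_PER_INCH = 72.0;

    // pixels: image width or height in pixels
    // pageUnits: size of the same side as drawn on the page, in page units
    static double effectiveDpi(int pixels, double pageUnits) {
        return pixels * UNITS_PER_INCH / pageUnits;
    }

    public static void main(String[] args) {
        // Values taken from the sample output above.
        double dpiX = effectiveDpi(2544, 610.56);
        double dpiY = effectiveDpi(3523, 845.52);
        System.out.printf("dpi = %.0f (X), %.0f (Y)%n", dpiX, dpiY);
    }
}
```

Because the result depends on how large the image is drawn, the same image placed at a different size on another page would yield a different effective DPI.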




Answer 2:


I am not familiar with PDFBox, but every raster image in a PDF has a CTM (current transformation matrix) associated with it. The CTM gives you the position and dimensions of the image on the page. That, together with the pixel dimensions of the extracted image, should be sufficient to calculate the effective DPI.
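
As a rough sketch of that idea, assume the four scaling/rotation entries a, b, c, d of the CTM have already been read from whatever library is in use. A PDF image is painted into a unit square, so the CTM maps its width to the vector (a, b) and its height to (c, d); the lengths of those vectors are the drawn size in page units, even when the image is rotated. The names `CtmDpi` and `effectiveDpi` are hypothetical, and the 72 units per inch is the default user space, which is an assumption here.

```java
// Sketch: effective DPI of an image from its pixel size and CTM entries.
// CtmDpi is an illustrative helper, not a PDFBox class.
class CtmDpi {
    // Returns {dpiX, dpiY}. a, b, c, d are the linear entries of the
    // CTM [a b c d e f]; the translation e, f does not affect the size.
    static double[] effectiveDpi(int widthPx, int heightPx,
                                 double a, double b, double c, double d) {
        double widthUnits = Math.hypot(a, b);   // length of the image's width vector
        double heightUnits = Math.hypot(c, d);  // length of the image's height vector
        // Assumption: 72 page units per inch (default user space).
        return new double[] {
            widthPx * 72.0 / widthUnits,
            heightPx * 72.0 / heightUnits
        };
    }

    public static void main(String[] args) {
        // Upright image drawn 610.56 x 845.52 page units.
        double[] upright = effectiveDpi(2544, 3523, 610.56, 0, 0, 845.52);
        // Same image rotated by 90 degrees: the matrix entries swap places.
        double[] rotated = effectiveDpi(2544, 3523, 0, 610.56, -845.52, 0);
        System.out.printf("upright: %.0f x %.0f dpi%n", upright[0], upright[1]);
        System.out.printf("rotated: %.0f x %.0f dpi%n", rotated[0], rotated[1]);
    }
}
```

Using the vector lengths rather than only the diagonal entries keeps the result stable for rotated images, where a and d alone can be zero.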



Source: https://stackoverflow.com/questions/5472711/dpi-of-image-extracted-from-pdf-with-pdfbox
