Question
Is there a free library that can report the resolution (in DPI) of the images contained in a PDF file?
I tried the following code using PDFSharp, but the DPI it returns is not correct. For example, it reports 96 dpi where it should be 150 dpi:
using (PdfDocument pdf = PdfReader.Open(sourcePdf))
{
    for (int i = 0; i < pdf.Pages.Count; i++)
    {
        XGraphics xGraphics = XGraphics.FromPdfPage(pdf.Pages[i]);
        float dpi = xGraphics.Graphics.DpiX;
    }
}
Answer 1:
You can use a command-line tool to get the info you need: pdfimages.
However, you need a recent version of pdfimages that is based on the Poppler library (NOT the 'pdfimages' that is based on XPDF!).
Recent Poppler versions let you use the -list option:
pdfimages -list -f 2 -l 4 my.pdf
The output of the above example command lists all images in the page range from 2 (-f: first page to show) to 4 (-l: last page to show).
Here is the output for the above command, using an example PDF file I prepared specifically for this question (scroll horizontally to see all columns):
page num  type width height color comp bpc  enc interp object ID x-ppi y-ppi size ratio
---------------------------------------------------------------------------------------
   2   0 image   697   1238  gray    1   8 jpeg     no     16  0   320   320 142K   17%
   3   1 image   697   1238  gray    1   8 jpeg     no     16  0   151   151 142K   17%
   4   2 image   697   1238  gray    1   8 jpeg     no     16  0    84   115 142K   17%
The output shows the following:
- There are three images on the three pages 2-4 (as indicated by columns 1+2, headed page and num).
- The PDF object IDs for all three images are identical: 16 0 (as indicated by columns 11+12, headed object + ID). This means the PDF defines only one distinct image object but shows it three times (i.e., the image is embedded only once, but appears on 3 pages).
- The image's width is 697 pixels, its height is 1238 pixels, its image depth (bits per color component) is 8, its colorspace is gray, its number of color channels/components is 1, its compression scheme is jpeg, its byte size (as embedded) is 142K, and its compression ratio is 17% (as indicated by columns 4-9 and 15+16, headed width, height, color, comp, bpc, enc, size and ratio).
- However, the same image appears on different pages in different resolutions (given as PPI, pixels per inch, not DPI):
  - page 2 shows it with a PPI of 320 in both directions,
  - page 3 shows it with a PPI of 151 in both directions,
  - while page 4 shows it with a PPI of 84 in horizontal (X) direction and 115 PPI in vertical (Y) direction.
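If you want to check resolutions from a script, the x-ppi and y-ppi columns can be pulled out of that listing with awk. The following is only a sketch: it assumes the 16-column layout shown in the listing above (other Poppler versions may print different columns), and the function names ppi_per_page and sample_listing are made up for this example. The sample rows are copied from the listing so the sketch runs without a PDF.

```shell
#!/bin/sh
# Sketch: print "page x-ppi y-ppi" for every image row of `pdfimages -list` output.
# Assumption: the 16-column layout shown above (x-ppi is field 13, y-ppi is field 14).

# ppi_per_page: hypothetical helper, reads a `pdfimages -list` listing on stdin.
ppi_per_page() {
    # Skip the two header lines and keep only rows whose type column is "image".
    awk 'NR > 2 && $3 == "image" { print $1, $13, $14 }'
}

# sample_listing: hypothetical helper that replays the rows from the listing above.
sample_listing() {
    cat <<'EOF'
page num type width height color comp bpc enc interp object ID x-ppi y-ppi size ratio
---------------------------------------------------------------------------------------
2 0 image 697 1238 gray 1 8 jpeg no 16 0 320 320 142K 17%
3 1 image 697 1238 gray 1 8 jpeg no 16 0 151 151 142K 17%
4 2 image 697 1238 gray 1 8 jpeg no 16 0 84 115 142K 17%
EOF
}

sample_listing | ppi_per_page
```

With a real file, the same filter would be fed from the tool directly, e.g. pdfimages -list my.pdf | ppi_per_page. Since both the command and its output format are plain text, the same approach should carry over to any language that can start a process and read its standard output.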
Now, if a command-line tool cannot be repurposed for your goal: the Poppler library, on which the tool shown above is based, is certainly Free ('free as in liberty' as well as 'free as in beer').
Here is a link to the PDF ("my.pdf") I used to demonstrate the output of the command above.
Answer 2:
PDFs do not necessarily use DPI in their definitions. PDF allows the document creator to define a custom user coordinate space, which may or may not map to anything similar to dots per inch.
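To make this concrete: an image in a PDF only has a pixel count, and its effective resolution depends on how large it is drawn on a given page. Here is a minimal sketch of that arithmetic, assuming the default user space unit of 1/72 inch. The helper name effective_ppi and the drawn widths (332.4 and 156.825 points) are made-up values chosen for illustration, not measurements from any real file.

```shell
#!/bin/sh
# Sketch: effective resolution = pixels / (drawn size in inches).
# Assumption: default PDF user space, where 1 unit = 1/72 inch.

# effective_ppi PIXELS DRAWN_SIZE_IN_POINTS  (hypothetical helper)
effective_ppi() {
    awk -v px="$1" -v pt="$2" 'BEGIN { printf "%.0f\n", px * 72 / pt }'
}

# A 697-pixel-wide image drawn 332.4 points wide (illustrative value).
effective_ppi 697 332.4

# The same 697 pixels drawn only 156.825 points wide (illustrative value).
effective_ppi 697 156.825
```

The same pixels drawn at half the size give roughly twice the PPI, which is why a single embedded image can report a different resolution each time it is placed.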
Source: https://stackoverflow.com/questions/27938551/how-to-check-pdf-pages-for-resolution-dpi-of-embedded-images