I spent many hours searching for a fast and easy, but above all accurate, way to get the number of pages in a PDF document.
Here is an R function that reports the number of pages in a PDF file by using the pdfinfo command.
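pdf.file.page.number <- function(fname) {
  # shQuote() keeps file names with spaces intact when passed to the shell
  a <- pipe(paste("pdfinfo", shQuote(fname), "| grep Pages | cut -d: -f2"))
  page.number <- as.numeric(readLines(a))
  close(a)
  page.number
}
if (FALSE) {
  # Example call (not run)
  pdf.file.page.number("a.pdf")
}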
The simplest option is to use ImageMagick.
Here is some sample code:
$image = new Imagick();
$image->pingImage('myPdfFile.pdf');
echo $image->getNumberImages();
Otherwise, you can also use PDF libraries for PHP such as MPDF or TCPDF.
Since you're ok with using command line utilities, you can use cpdf (Microsoft Windows/Linux/Mac OS X). To obtain the number of pages in one PDF:
cpdf.exe -pages "my file.pdf"
Here is a simple example to get the number of pages in PDF with PHP.
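<?php
// Heuristic: counts "/Page" markers in the raw file contents. It can miscount
// when page objects are stored inside compressed object streams.
function count_pdf_pages($pdfname) {
    $pdftext = file_get_contents($pdfname);
    $num = preg_match_all("/\/Page\W/", $pdftext, $dummy);
    return $num;
}

$pdfname = 'example.pdf'; // Put your PDF path
$pages = count_pdf_pages($pdfname);
echo $pages;
?>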
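The same byte-scanning idea can be sketched in Python. This is only an illustration, not the answer's own code: the names `PAGE_OBJECT`, `count_page_objects` and `count_pdf_pages` are made up for the example, and the pattern is a slightly stricter variant that requires `/Type` in front of `/Page` and skips `/Pages` (the page-tree node). It shares the limits of the PHP version: it only sees page objects stored uncompressed in the file, so PDFs that use compressed object streams, or that carry old page objects from incremental saves, may be miscounted.

```python
import re

# Match "/Type /Page" but not "/Type /Pages" (the page-tree node).
PAGE_OBJECT = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def count_page_objects(pdf_bytes: bytes) -> int:
    """Heuristically count page objects in raw PDF bytes."""
    return len(PAGE_OBJECT.findall(pdf_bytes))


def count_pdf_pages(path: str) -> int:
    """Read a PDF from disk and apply the heuristic above."""
    with open(path, "rb") as handle:
        return count_page_objects(handle.read())
```

If the result matters, it is safer to treat this as a fallback and prefer a tool that actually parses the document structure.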
Counting the lines that ImageMagick's identify command prints (one per page) seems to work pretty well, without the need for special packages or detailed parsing of command output.
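<?php
$target_pdf = "multi-page-test.pdf";
// escapeshellarg() protects file names containing spaces or shell metacharacters
$cmd = sprintf("identify %s", escapeshellarg($target_pdf));
exec($cmd, $output);
// identify prints one line per page, so the line count is the page count
$pages = count($output);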
A simple command-line tool called pdfinfo does the job. It is downloadable for Linux and Windows as part of a compressed archive containing several small PDF-related programs. Extract the archive somewhere.
One of the extracted files is pdfinfo (or pdfinfo.exe for Windows). Here is an example of the data returned by running it on a PDF document:
Title: test1.pdf
Author: John Smith
Creator: PScript5.dll Version 5.2.2
Producer: Acrobat Distiller 9.2.0 (Windows)
CreationDate: 01/09/13 19:46:57
ModDate: 01/09/13 19:46:57
Tagged: yes
Form: none
Pages: 13 <-- This is what we need
Encrypted: no
Page size: 2384 x 3370 pts (A0)
File size: 17569259 bytes
Optimized: yes
PDF version: 1.6
I haven't seen a PDF document for which it returned a wrong page count (yet). It is also really fast: even with big documents of 200+ MB, the response time is just a few seconds or less.
There is an easy way of extracting the pagecount from the output, here in PHP:
// Make a function for convenience
function getPDFPages($document)
{
    // Pick the binary for the current platform (adjust the paths to your install)
    $cmd = (stripos(PHP_OS, 'WIN') === 0)
        ? "C:\\path\\to\\pdfinfo.exe" // Windows
        : "/path/to/pdfinfo";         // Linux

    // Run pdfinfo and collect every output line;
    // escapeshellarg() quotes the file name, so names with spaces work
    exec("$cmd " . escapeshellarg($document), $output);

    // Iterate through lines
    $pagecount = 0;
    foreach ($output as $op) {
        // Extract the number
        if (preg_match("/Pages:\s*(\d+)/i", $op, $matches) === 1) {
            $pagecount = intval($matches[1]);
            break;
        }
    }

    return $pagecount;
}

// Use the function
echo getPDFPages("test 1.pdf"); // Output: 13
Of course this command line tool can be used in other languages that can parse output from an external program, but I use it in PHP.
I know it's not pure PHP, but external programs are much better at handling PDFs (as seen in the question).
I hope this helps people. I spent a lot of time trying to find a solution, and I saw many questions about PDF page counts that did not have the answer I was looking for. That's why I asked this question and answered it myself.
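As an illustration of using the same tool from another language, here is a minimal Python sketch of the pdfinfo approach. It assumes a pdfinfo binary is reachable on the PATH (or that its full path is passed in); the function names `parse_page_count` and `get_pdf_pages` are invented for this example. Like the PHP function above, it returns 0 when no "Pages:" line is found.

```python
import re
import subprocess

# The "Pages:" line of pdfinfo output, e.g. "Pages: 13".
PAGES_LINE = re.compile(r"^Pages:\s*(\d+)", re.MULTILINE)


def parse_page_count(pdfinfo_output: str) -> int:
    """Return the number from the 'Pages:' line, or 0 if the line is absent."""
    match = PAGES_LINE.search(pdfinfo_output)
    return int(match.group(1)) if match else 0


def get_pdf_pages(document: str, pdfinfo: str = "pdfinfo") -> int:
    """Run pdfinfo on a file and return its page count."""
    # Passing a list avoids the shell, so file names with spaces need no quoting.
    result = subprocess.run(
        [pdfinfo, document],
        capture_output=True,
        text=True,
        errors="replace",  # metadata such as the title may not be valid UTF-8
        check=True,        # raise CalledProcessError if pdfinfo fails
    )
    return parse_page_count(result.stdout)
```

Keeping the parsing in its own function makes it easy to check against captured output without having pdfinfo installed; a call such as `get_pdf_pages("test 1.pdf")` then only adds the process invocation.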