Determine the Number of Pages in a PDF File

Get the number of pages in a PDF document

Use a simple command-line executable called pdfinfo.

It is downloadable for Linux and Windows. You download a compressed file containing several little PDF-related programs. Extract it somewhere.

One of those files is pdfinfo (or pdfinfo.exe for Windows). An example of data returned by running it on a PDF document:

Title:          test1.pdf
Author:         John Smith
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 9.2.0 (Windows)
CreationDate:   01/09/13 19:46:57
ModDate:        01/09/13 19:46:57
Tagged:         yes
Form:           none
Pages:          13 <-- This is what we need
Encrypted:      no
Page size:      2384 x 3370 pts (A0)
File size:      17569259 bytes
Optimized:      yes
PDF version:    1.6

I haven't yet seen a PDF document for which it returned a wrong page count. It is also really fast: even with big documents of 200+ MB, the response time is a few seconds or less.

There is an easy way of extracting the pagecount from the output, here in PHP:

// Make a function for convenience
function getPDFPages($document)
{
    // Keep only the line that matches your platform
    $cmd = "/path/to/pdfinfo"; // Linux
    // $cmd = "C:\\path\\to\\pdfinfo.exe"; // Windows

    // Run pdfinfo and collect its output lines.
    // escapeshellarg() quotes the file name, which handles spaces and untrusted input
    exec($cmd . " " . escapeshellarg($document), $output);

    // Iterate through lines
    $pagecount = 0;
    foreach ($output as $op) {
        // Extract the number from the line that starts with "Pages:"
        if (preg_match("/^Pages:\s*(\d+)/i", $op, $matches) === 1) {
            $pagecount = intval($matches[1]);
            break;
        }
    }

    return $pagecount;
}

// Use the function
echo getPDFPages("test 1.pdf"); // Output: 13

Of course this command line tool can be used in other languages that can parse output from an external program, but I use it in PHP.
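
As a rough illustration of the same approach in another language, here is a minimal Python sketch. It assumes pdfinfo is installed and on the PATH; the function names parse_page_count and get_pdf_pages are made up for this example. Passing the arguments as a list bypasses the shell, so file names with spaces need no quoting.

```python
import re
import subprocess


def parse_page_count(pdfinfo_output: str) -> int:
    """Return the value of the 'Pages:' line in pdfinfo output, or 0 if absent."""
    match = re.search(r"^Pages:\s*(\d+)", pdfinfo_output, re.MULTILINE)
    return int(match.group(1)) if match else 0


def get_pdf_pages(document: str, pdfinfo: str = "pdfinfo") -> int:
    """Run pdfinfo on a file and return its page count.

    Assumes the pdfinfo executable is installed; raises CalledProcessError
    if pdfinfo exits with an error (for example, on an unreadable file).
    """
    result = subprocess.run(
        [pdfinfo, document], capture_output=True, text=True, check=True
    )
    return parse_page_count(result.stdout)
```

Keeping the parsing separate from the process call makes it easy to check the regular expression against saved pdfinfo output without having the tool installed.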

I know it's not pure PHP, but external programs are far better at handling PDFs (as seen in the question).

I hope this can help people, because I have spent a whole lot of time trying to find the solution to this and I have seen a lot of questions about PDF pagecount in which I didn't find the answer I was looking for. That's why I made this question and answered it myself.

Security Notice: Use escapeshellarg on $document if document name is being fed from user input or file uploads.
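
For comparison, the same precaution in Python might look like the sketch below. shlex.quote plays a role similar to escapeshellarg for POSIX shells; build_command is a hypothetical helper written for this example. Where possible, passing an argument list to subprocess.run without a shell avoids the quoting problem altogether.

```python
import shlex


def build_command(document: str, pdfinfo: str = "pdfinfo") -> str:
    """Build a POSIX shell command line with the file name safely quoted."""
    return f"{shlex.quote(pdfinfo)} {shlex.quote(document)}"


# A file name with a space is wrapped in single quotes
print(build_command("test 1.pdf"))        # pdfinfo 'test 1.pdf'
# Shell metacharacters in an uploaded file name stay inside the quotes
print(build_command("a.pdf; rm -rf /"))   # pdfinfo 'a.pdf; rm -rf /'
```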

Determine the number of pages in a PDF file

You can use Apache PDFBox to load a PDF document and then call the getNumberOfPages method to return the page count.

// PDFBox 2.x API; PDFBox 3.x uses Loader.loadPDF(new File("file.pdf")) instead
try (PDDocument doc = PDDocument.load(new File("file.pdf"))) {
    int count = doc.getNumberOfPages();
}

Determine number of pages in a PDF file

You'll need a PDF API for C#. iTextSharp is one possible API, though better ones might exist.

iTextSharp Example

You must add iTextSharp.dll as a reference. Download iTextSharp from SourceForge.net. This is a complete working console application.

using System;
using iTextSharp.text.pdf;

namespace GetPages_PDF
{
    class Program
    {
        static void Main(string[] args)
        {
            // Right side of the assignment is the location of YOUR PDF file
            string ppath = "C:\\aworking\\Hawkins.pdf";
            PdfReader pdfReader = new PdfReader(ppath);
            int numberOfPages = pdfReader.NumberOfPages;
            pdfReader.Close(); // Release the file handle
            Console.WriteLine(numberOfPages);
            Console.ReadLine(); // Keep the console window open
        }
    }
}

Count the number of pages in a PDF in only PHP

You can use the ImageMagick extension for PHP. ImageMagick understands PDFs, and you can use the identify command to extract the number of pages. The PHP function is Imagick::identifyImage().

Count total number of pages in pdf file

function menuItem() {
  var folder = DriveApp.getFoldersByName('Test').next();
  var contents = folder.searchFiles('title contains ".PDF"');
  var file;
  var name;
  var sheet = SpreadsheetApp.getActiveSheet();
  var count;
  var data;

  sheet.clear();
  sheet.appendRow(["Name", "Number of pages"]);

  while (contents.hasNext()) {
    file = contents.next();
    name = file.getName();
    // Heuristic: count the "/Contents" entries in the raw PDF text
    count = file.getBlob().getDataAsString().split("/Contents").length - 1;

    data = [name, count];
    sheet.appendRow(data);
  }
}

function onOpen() {
  var ui = SpreadsheetApp.getUi();
  ui.createMenu('PDF Page Calculator')
    .addItem("PDF Page Calculator", 'menuItem')
    .addToUi();
}
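
The script above estimates the page count by counting occurrences of /Contents in the raw file. A similar heuristic, sketched here in Python, counts /Type /Page object markers instead. Both are approximations: they read the PDF as plain bytes, so they can miscount files whose page objects are stored in compressed object streams. The name estimate_page_count is hypothetical.

```python
import re

# Matches "/Type /Page" (a single page object) but not "/Type /Pages"
# (the page-tree node), with or without whitespace between the two names.
PAGE_MARKER = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def estimate_page_count(pdf_bytes: bytes) -> int:
    """Estimate the page count by counting page-object markers in raw PDF bytes."""
    return len(PAGE_MARKER.findall(pdf_bytes))
```

When the result matters, a real PDF parser or a tool such as pdfinfo is the safer choice; a heuristic like this is mainly useful where no PDF library is available.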

Total number of pages in a PDF document

I'm not aware of a way that would let you do this. But you can use try/catch to handle the situation directly, without knowing the number of pages beforehand.

If you do need to know the number of pages beforehand, you could iterate through the pages until you hit an error, which you handle using try/catch (this works for small PDFs), or implement e.g. a binary search in a similar way.
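
The probing idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions: get_page stands for whatever hypothetical call your PDF API offers to fetch a 1-based page and is assumed to raise an exception for a page that does not exist; count_pages and probe_from_getter are names invented for this sketch. The search first doubles an upper bound, then bisects, so it needs on the order of log(n) probes rather than n.

```python
from typing import Any, Callable


def probe_from_getter(get_page: Callable[[int], Any]) -> Callable[[int], bool]:
    """Wrap a page getter that raises on a missing page into a True/False probe."""
    def page_exists(n: int) -> bool:
        try:
            get_page(n)
            return True
        except Exception:
            return False
    return page_exists


def count_pages(page_exists: Callable[[int], bool]) -> int:
    """Find the last existing 1-based page number using doubling plus binary search."""
    if not page_exists(1):
        return 0
    # Doubling phase: grow hi until it points past the last page
    lo, hi = 1, 2
    while page_exists(hi):
        lo, hi = hi, hi * 2
    # Invariant: page lo exists, page hi does not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if page_exists(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For small documents a plain loop from page 1 upward is just as good; the binary search pays off only when each probe is expensive or the document is long.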

Find the number of pages in a PDF using PHP

The simplest approach is to use ImageMagick. Here is some sample code:

$image = new Imagick();
$image->pingImage('myPdfFile.pdf');
echo $image->getNumberImages();

Otherwise, you can also use PDF libraries for PHP such as MPDF or TCPDF.

How to get the number of pages of a .PDF uploaded by user?

If you use pdf.js, you can refer to an example on GitHub ('.../examples/node/getinfo.js') with the following code, which prints the number of pages in a PDF file.

const pdfjsLib = require('pdfjs-dist');
...
// In current pdf.js releases, getDocument() returns a loading task; use its .promise
pdfjsLib.getDocument(pdfPath).promise.then(function (doc) {
  var numPages = doc.numPages;
  console.log('# Document Loaded');
  console.log('Number of Pages: ' + numPages);
});

How to get the number of pages inside each pdf document inside a directory

A similar question has already been asked on Stack Overflow here.
Hope it helps.

EDIT:

This is how you can find the number of pages in each PDF file in your specified directory:

using System;
using System.IO;
using iTextSharp.text.pdf;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] pdfFiles = Directory.GetFiles(@"C:\MyFolder\", "*.pdf");
            Console.WriteLine("{0} PDF files in directory", pdfFiles.Length);
            foreach (string pdfFile in pdfFiles)
            {
                int pageCount = GetNumberOfPages(pdfFile);
                Console.WriteLine("{0} has {1} pages", pdfFile, pageCount);
            }
            Console.ReadLine(); // Keep the console window open
        }

        static int GetNumberOfPages(string filePath)
        {
            PdfReader pdfReader = new PdfReader(filePath);
            int pageCount = pdfReader.NumberOfPages;
            pdfReader.Close(); // Release the file handle
            return pageCount;
        }
    }
}

You will have to download itextsharp.dll from here and include that in References.
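
The directory loop in the C# program above does not depend on any particular PDF library. As a sketch of that structure in Python, the function below takes the page counter as a parameter, so any of the techniques on this page could be plugged in. The name count_pages_in_directory and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Callable, Dict


def count_pages_in_directory(
    directory: str, count_pages: Callable[[Path], int]
) -> Dict[str, int]:
    """Map each *.pdf file name in `directory` to the value returned by count_pages.

    Note: the "*.pdf" pattern is case-sensitive on most non-Windows systems,
    and subdirectories are not searched.
    """
    results: Dict[str, int] = {}
    for pdf_path in sorted(Path(directory).glob("*.pdf")):
        results[pdf_path.name] = count_pages(pdf_path)
    return results
```

Usage would look like count_pages_in_directory("C:/MyFolder", my_counter), where my_counter is whatever function you use to count the pages of a single file.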


