How to Insert a PDPage Within Another PDPage with PDFBox

How to insert a PDPage within another PDPage with PDFBox

Is it possible to use PDFBox to include other PDF pages as embedded objects within a page?

It should be possible. The PDF format allows the use of so-called form XObjects to serve as such embedded objects. I don't see an explicit implementation for that in PDFBox, but the procedure is similar enough to what PageExtractor or PDFMergerUtility do.

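A proof of concept derived from PageExtractor, using the then-current SNAPSHOT of the PDFBox 2.0.0 development version (several of the methods used here, such as loadNonSeq, getAllPages, and findCropBox, were renamed or removed before the 2.0.0 release):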

PDDocument source = PDDocument.loadNonSeq(SOURCE, null);
List<PDPage> pages = source.getDocumentCatalog().getAllPages();

PDDocument target = new PDDocument();
PDPage page = new PDPage();
PDRectangle cropBox = page.findCropBox();
page.setResources(new PDResources());
target.addPage(page);

PDFormXObject xobject = importAsXObject(target, pages.get(0));
page.getResources().addXObject(xobject, "X");

PDPageContentStream content = new PDPageContentStream(target, page);
AffineTransform transform = new AffineTransform(0, 0.5, -0.5, 0, cropBox.getWidth(), 0);
content.drawXObject(xobject, transform);
transform = new AffineTransform(0.5, 0.5, -0.5, 0.5, 0.5 * cropBox.getWidth(), 0.2 * cropBox.getHeight());
content.drawXObject(xobject, transform);
content.close();

target.save(TARGET);
target.close();
source.close();

This code imports the first page of a source document into a target document as a form XObject and draws it twice onto a page there with different scaling and rotation transformations, e.g. for this source

Source PDF, page 1

it creates this

Target PDF
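To see what the two matrices in the proof of concept do, one can apply them to the page corners outside of PDFBox. The following is a minimal sketch using only java.awt.geom. It assumes a US Letter crop box of 612 x 792 points, and the class and method names are made up for illustration. Note that the AffineTransform constructor takes its arguments in the order m00, m10, m01, m11, m02, m12. Read that way, the first matrix is a 90 degree counterclockwise rotation at half scale, shifted right by the page width, and the second one is a 45 degree rotation at a scale of about 0.707.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

class TransformDemo {
    // Assumed page size (US Letter, in points); the proof of concept reads it from the crop box.
    static final double WIDTH = 612;
    static final double HEIGHT = 792;

    /** First matrix: x' = WIDTH - 0.5 * y, y' = 0.5 * x. */
    static AffineTransform quarterTurn() {
        return new AffineTransform(0, 0.5, -0.5, 0, WIDTH, 0);
    }

    /** Second matrix: x' = 0.5 * x - 0.5 * y + 0.5 * WIDTH, y' = 0.5 * x + 0.5 * y + 0.2 * HEIGHT. */
    static AffineTransform eighthTurn() {
        return new AffineTransform(0.5, 0.5, -0.5, 0.5, 0.5 * WIDTH, 0.2 * HEIGHT);
    }

    /** Applies the transform to a single point of the imported page. */
    static Point2D map(AffineTransform transform, double x, double y) {
        return transform.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        double[][] corners = {{0, 0}, {WIDTH, 0}, {WIDTH, HEIGHT}, {0, HEIGHT}};
        for (double[] corner : corners) {
            System.out.println("corner (" + corner[0] + ", " + corner[1] + ")"
                    + " -> first: " + map(quarterTurn(), corner[0], corner[1])
                    + ", second: " + map(eighthTurn(), corner[0], corner[1]));
        }
    }
}
```

Printing the mapped corners this way is a quick check that a transformed page still lies inside the target crop box before any PDF is written.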

The helper method importAsXObject actually doing the import is defined like this:

PDFormXObject importAsXObject(PDDocument target, PDPage page) throws IOException
{
    final PDStream src = page.getContents();
    if (src != null)
    {
        final PDFormXObject xobject = new PDFormXObject(target);

        OutputStream os = xobject.getPDStream().createOutputStream();
        InputStream is = src.createInputStream();
        try
        {
            IOUtils.copy(is, os);
        }
        finally
        {
            IOUtils.closeQuietly(is);
            IOUtils.closeQuietly(os);
        }

        xobject.setResources(page.findResources());
        xobject.setBBox(page.findCropBox());

        return xobject;
    }
    return null;
}

As mentioned above, this is only a proof of concept; corner cases have not yet been taken into account.

Add page as layer from separate PDF (different page size) using PDFBox

PDFBox dependencies

The main issue was that PDFBox has three core components and one required dependency. One core component was missing.

In comments the OP clarified that

Actually process doesn't hangs, the file is just not created at all.

As this sounded like there might have been an exception or error, wrapping the code in a try { ... } catch (Throwable t) { t.printStackTrace(); } block was proposed in chat. And indeed, this revealed:

java.lang.NoClassDefFoundError: org/apache/fontbox/util/BoundingBox 
at org.apache.pdfbox.util.LayerUtility.importPageAsForm(LayerUtility.java:203)
at org.apache.pdfbox.util.LayerUtility.importPageAsForm(LayerUtility.java:135)
at ...

As it turned out, fontbox.jar was missing from the OP's setup.

The PDFBox version 1.8.x dependencies are described in the PDFBox documentation. In particular, there are the three core components pdfbox, fontbox, and jempbox, all of which must be present in the same version, and there is the required dependency commons-logging.

As soon as the missing component had been added, the sample worked properly.

Positioning the imported page

The imported page can be positioned on the target page by means of a translation in the AffineTransform parameter. This parameter also allows for other transformations, e.g. to scale, rotate, mirror, or skew.

For the original sample files this PDF page

Source page from test-pdf.pdf

was added onto this page

Sample Image

which resulted in this page

result of the OP's original code

The OP then wondered

how to position the imported layer

The parameter for that in the layerUtility.appendFormAsLayer call is the AffineTransform affineTransform. The OP used new AffineTransform() here, which creates an identity matrix, which in turn causes the source page to be added at the origin of the coordinate system, in this case at the bottom.

By using a translation instead of the identity, e.g.

PDRectangle destCrop = destinationPages.get(0).findCropBox();
PDRectangle sourceBox = firstForm.getBBox();
AffineTransform affineTransform = AffineTransform.getTranslateInstance(0, destCrop.getUpperRightY() - sourceBox.getHeight());

one can position the source page elsewhere, e.g. at the top:

result using the translation above
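The same idea extends to source pages of a different size: scale first, then translate. The following is a minimal sketch using only java.awt.geom. The helper names and the page sizes in main (an A4-like 595 x 842 destination and a 400 x 300 source box) are assumptions for illustration, and like the translation above it assumes that the lower left corner of the source box is at the origin.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

class PlacementDemo {
    /** Translation that aligns the top edge of the source box with the top edge of the destination. */
    static AffineTransform atTop(double destUpperRightY, double sourceHeight) {
        return AffineTransform.getTranslateInstance(0, destUpperRightY - sourceHeight);
    }

    /** Scales the source to the destination width, then aligns the top edges. */
    static AffineTransform fitWidthAtTop(double destWidth, double destUpperRightY,
                                         double sourceWidth, double sourceHeight) {
        double scale = destWidth / sourceWidth;
        AffineTransform transform =
                AffineTransform.getTranslateInstance(0, destUpperRightY - scale * sourceHeight);
        // scale() concatenates, so the scaling is applied to the source coordinates first
        transform.scale(scale, scale);
        return transform;
    }

    public static void main(String[] args) {
        // Hypothetical sizes in points: destination 595 x 842, source 400 x 300
        AffineTransform fitted = fitWidthAtTop(595, 842, 400, 300);
        System.out.println("lower left  -> " + fitted.transform(new Point2D.Double(0, 0), null));
        System.out.println("upper right -> " + fitted.transform(new Point2D.Double(400, 300), null));
    }
}
```

The resulting AffineTransform can then be passed to appendFormAsLayer in place of the plain translation.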

PDFBox LayerUtility's expectations

Unfortunately it turns out that layerUtility.appendFormAsLayer appends the form to the page without resetting the graphics context.

layerUtility.appendFormAsLayer uses this code to add an additional content stream:

PDPageContentStream contentStream = new PDPageContentStream(
targetDoc, targetPage, true, !DEBUG);

Unfortunately, a content stream generated by this constructor inherits the graphics state as it is at the end of the existing content of the target page. In particular, this means that the user space coordinate system may not be in its default state anymore. Some software, for example, mirrors the coordinate system to have y coordinates increasing downwards.

If instead

PDPageContentStream contentStream = new PDPageContentStream(
targetDoc, targetPage, true, !DEBUG, true);

had been used, the graphics state would have been reset to its default state and, therefore, be known.

By itself, therefore, this method is not usable in a controlled manner for arbitrary input.

Fortunately, though, the LayerUtility also has a method wrapInSaveRestore(PDPage) to overcome this weakness by manipulating the content of the given page to have the default graphics state at the end.

Thus, one should replace

layerUtility.appendFormAsLayer(destinationPages.get(0), firstForm, affineTransform, "external page");

by

PDPage destPage = destinationPages.get(0);
layerUtility.wrapInSaveRestore(destPage);
layerUtility.appendFormAsLayer(destPage, firstForm, affineTransform, "external page");
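Why the inherited graphics state matters can be illustrated without PDFBox. The following is a minimal sketch using only java.awt.geom. It assumes that the existing page content ends with a mirrored coordinate system (matrix 1 0 0 -1 0 H, with y growing downwards) on a page 842 points high; the class and method names are invented for illustration. A transform appended to such a stream is concatenated with the current transformation matrix, so the form ends up somewhere other than where the transform alone suggests.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

class InheritedStateDemo {
    // Assumed page height in points
    static final double PAGE_HEIGHT = 842;

    /** Matrix left behind by content that flips the y axis so that y grows downwards. */
    static AffineTransform flippedCtm() {
        return new AffineTransform(1, 0, 0, -1, 0, PAGE_HEIGHT);
    }

    /** Effective placement when the form's transform is appended under an inherited matrix. */
    static AffineTransform effective(AffineTransform inherited, AffineTransform formTransform) {
        AffineTransform result = new AffineTransform(inherited);
        // concatenate() applies formTransform to the points first, then the inherited matrix
        result.concatenate(formTransform);
        return result;
    }

    public static void main(String[] args) {
        AffineTransform identity = new AffineTransform();
        Point2D formOrigin = new Point2D.Double(0, 0);
        System.out.println("default state:   "
                + effective(identity, identity).transform(formOrigin, null));
        System.out.println("inherited flip:  "
                + effective(flippedCtm(), identity).transform(formOrigin, null));
    }
}
```

With the default graphics state the form origin stays at the bottom left corner; under the assumed flip it moves to the top left corner and the form is drawn mirrored, which is the situation wrapInSaveRestore avoids.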

Create PDPage with right format using PDFBox

You can do it like this:

PDRectangle A4L = new PDRectangle(PDRectangle.A4.getHeight(), PDRectangle.A4.getWidth());
PDPage page = new PDPage(A4L);
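For page sizes without a predefined constant, the same width and height swap works on dimensions computed by hand. PDF page sizes are given in points of 1/72 inch, and A4 is 210 x 297 mm. The following is a minimal sketch in plain Java; the class and method names are made up for illustration.

```java
class PageSizeDemo {
    static final double POINTS_PER_INCH = 72.0;
    static final double MM_PER_INCH = 25.4;

    /** Converts millimetres to PDF points (1/72 inch). */
    static double mmToPoints(double mm) {
        return mm / MM_PER_INCH * POINTS_PER_INCH;
    }

    /** Returns {width, height} in points of the landscape page for the given portrait size in mm. */
    static double[] landscape(double portraitWidthMm, double portraitHeightMm) {
        // Landscape simply swaps the portrait width and height
        return new double[] {mmToPoints(portraitHeightMm), mmToPoints(portraitWidthMm)};
    }

    public static void main(String[] args) {
        double[] a4Landscape = landscape(210, 297);
        System.out.println("A4 landscape: " + a4Landscape[0] + " x " + a4Landscape[1] + " pt");
    }
}
```

The two values can be passed, cast to float, to the PDRectangle(width, height) constructor used above.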

Adding an image to a pdf via pdfbox but after adding the pdf get blank

As MKL & Tilman have written in their comments, you have to use

PDPageContentStream(document, page, AppendMode.APPEND, true, true);

public static void addImageToPDF() throws IOException {
    File file = new File("C:\\Users\\user\\Downloads\\Risk Template(RiskTemplate).pdf");
    PDDocument doc = PDDocument.load(file);
    PDPage page = doc.getPage(0);
    PDImageXObject pdImage = PDImageXObject.createFromFile("D:\\Development\\Workspace\\1\\image.png", doc);
    PDPageContentStream contents = new PDPageContentStream(doc, page, AppendMode.APPEND, true, true);
    contents.drawImage(pdImage, 5, 5);
    System.out.println("Image inserted");
    contents.close();
    doc.save("D:\\Development\\Workspace\\1\\InsertImage_OP.pdf");

    doc.close();
}

Disclaimer: This answer was given by MKL/Tilman but for further reference an answer is more "visible" than a comment.

What is the correct way to deep clone PDPage?

I got the solution!

1) Start with an empty PDF template that has only one page. Open the template document, fill in the common data, and save it as a byte[] in memory.


PDDocument templatedoc = PDDocument.load(new File(path));
PDDocumentCatalog catalog = templatedoc.getDocumentCatalog();

PDAcroForm acroForm = catalog.getAcroForm();

... fill acroForm common data of all pages ...

ByteArrayOutputStream basicTemplate = new ByteArrayOutputStream();
templatedoc.save(basicTemplate);

byte[] filledBasicTemplate = basicTemplate.toByteArray();

2) Generate a new document for each page needed.


List<PDDocument> documents = new ArrayList<PDDocument>();
PDDocument activeDoc;

for (int i = 0; i < 5; i++) {
    activeDoc = PDDocument.load(filledBasicTemplate);
    documents.add(activeDoc);

    ... fill the AcroForm or whatever else you need on each page ...
}

3) Import the first page of each new document into the final document and save the final document.


PDDocument finalDoc = new PDDocument();

for (PDDocument currentDoc : documents) {
    ... fill common things like page numbers ...
    finalDoc.importPage(currentDoc.getPage(0));
}

finalDoc.save(new File(path));

... close all documents after saving the final document ...

It may not be the most optimized code, but it works.


