How to Join Multiple PDF Pages to a Single Page


There are several ways to perform this task; here are two, one easier and one harder.



THE EASIER: A MULTIVALENT.JAR WAY

Multivalent.jar is an excellent piece of free software that can perform many useful tasks on PDF files.

You can download it from the link below (the 2009 Multivalent.jar build available on SourceForge no longer includes the PDF tools):

  • https://rg.to/file/c6bd7f31bf8885bcaa69b50ffab7e355

  • You need to know the width and height of your PDF (on Linux you can use pdfinfo).

  • Assuming your multipage PDF is in ISO A4 size (21x29.7cm), type:


java -cp path..to/Multivalent.jar tool.pdf.Impose -dim 4x1 -paper 84x29.7cm input.pdf


This is the resulting page, composed of the 4 sequential pages stitched together side by side:

(image: 4 PDF pages appended side by side)

  • resulting pdf file
    http://ge.tt/98Kv4ce/v/0

Explanation:

-dim 4x1 sets the layout as columns x rows: 4 columns and 1 row.

-paper 84x29.7cm sets the paper size of the final imposed document containing the 4 pages joined side by side. Since the final PDF has 4 columns and only one row, you need to multiply the document width (21 cm) by 4.

Multivalent also accepts other units for the paper size: inches (-paper 33.07x11.69in) or PostScript points (-paper 2380x841pt).
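The paper-size arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example and are not part of Multivalent. It multiplies the page size by the grid dimensions and converts the result to the other units.

```python
# Sketch: compute the sheet size to pass to -paper for an n-up imposition.
# All names here are hypothetical helpers, not part of Multivalent.
CM_PER_INCH = 2.54
POINTS_PER_INCH = 72


def imposed_paper_size(page_w_cm, page_h_cm, cols, rows):
    """Return (width, height) in cm of a sheet holding a cols x rows grid."""
    return page_w_cm * cols, page_h_cm * rows


def cm_to_inches(value_cm):
    return value_cm / CM_PER_INCH


def cm_to_points(value_cm):
    return value_cm / CM_PER_INCH * POINTS_PER_INCH


if __name__ == "__main__":
    # Four A4 pages (21 x 29.7 cm) side by side: 4 columns, 1 row.
    width_cm, height_cm = imposed_paper_size(21, 29.7, cols=4, rows=1)
    print(f"-paper {width_cm:g}x{height_cm:g}cm")
    print(f"-paper {cm_to_inches(width_cm):.2f}x{cm_to_inches(height_cm):.2f}in")
    print(f"-paper {cm_to_points(width_cm):.0f}x{cm_to_points(height_cm):.0f}pt")
```

For a 2x2 grid you would call imposed_paper_size(21, 29.7, cols=2, rows=2) and pass -dim 2x2 accordingly.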





THE HARDER: A LATEX WAY:


Some years ago, Peter Flynn suggested in comp.text.pdf, for a similar task, a way of appending PDF pages side by side using only LaTeX. If you are a LaTeX user, you can proceed as follows.

Since you need to append the four pages of your single multipage PDF side by side, write a new LaTeX document that includes them.

Assuming your PDF document is named input.pdf, its size is ISO A4, and it is in your working folder, the document looks like this:

\documentclass[a4paper]{article}
\usepackage[margin=0mm,nohead,nofoot]{geometry}
\usepackage{pdfpages}
\pagestyle{empty}
\parindent0pt
\begin{document}
\includepdfmerge[nup=1x4,landscape]{input.pdf,1,input.pdf,2,input.pdf,3,input.pdf,4}
\end{document}
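For a document with more than a handful of pages, typing the file,page list by hand gets tedious. The following Python sketch builds that argument; the helper name is hypothetical and the script only prints the LaTeX line, it does not run LaTeX.

```python
def includepdfmerge_args(pdf_name, first_page, last_page):
    """Build the 'file,page,file,page,...' list in the form used above."""
    parts = []
    for page in range(first_page, last_page + 1):
        parts.append(f"{pdf_name},{page}")
    return ",".join(parts)


if __name__ == "__main__":
    # Same case as above: pages 1 to 4 of input.pdf.
    args = includepdfmerge_args("input.pdf", 1, 4)
    print(r"\includepdfmerge[nup=1x4,landscape]{" + args + "}")
```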

Merge / convert multiple PDF files into one PDF

Sorry, I managed to find the answer myself using Google and a bit of luck :)

For those interested:

I installed pdftk (the PDF toolkit) on our Debian server, and the following command gave the desired output:

pdftk file1.pdf file2.pdf cat output output.pdf

OR

gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf file1.pdf file2.pdf file3.pdf ...

This in turn can be piped directly into pdf2ps.
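If you want to call pdftk from a script rather than typing the command, a minimal Python sketch could look like this. It assumes pdftk is installed and on PATH; the two function names are invented for this example.

```python
import shutil
import subprocess


def pdftk_cat_command(inputs, output):
    """Return the argv list for: pdftk <inputs...> cat output <output>."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return ["pdftk", *inputs, "cat", "output", output]


def merge_with_pdftk(inputs, output):
    """Run pdftk; raises CalledProcessError if pdftk reports an error."""
    if shutil.which("pdftk") is None:
        raise FileNotFoundError("pdftk is not on PATH")
    subprocess.run(pdftk_cat_command(inputs, output), check=True)


if __name__ == "__main__":
    # Only prints the command; call merge_with_pdftk(...) to actually run it.
    print(" ".join(pdftk_cat_command(["file1.pdf", "file2.pdf"], "output.pdf")))
```

Passing the arguments as a list (instead of one shell string) avoids quoting problems with file names that contain spaces.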

Join multiple PDF files into a single page PDF - positioned join using PHP

I've done this in the past with a combination of Ghostscript (to position an existing PDF page on a larger empty PDF page) and pdftk (to overlay/merge two equally-sized PDF pages into a new one).

Have a look at these answers (on Superuser.com) to get an idea of how this works:

  • Convert PDF 2 sides per page to 1 side per page
  • Using Ghostscript to convert multi-page PDF into single JPG?
  • Freeware to split a pdf's pages down the middle?

My procedure uses the command line and/or scripts. However, it could also be extended to run programmatically from PHP with the Ghostscript .dll/.so.

Merge PDF pages to 1 file without generating single page files

You need to use BytesIO:

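import os
from io import BytesIO

import cv2
import pytesseract
from PyPDF2 import PdfFileMerger

# filesets, download_location, tessdata_dir_config and FILE are assumed
# to be defined earlier in your script
for fileset in filesets:
    merger = PdfFileMerger()
    page_path = fr".\output\pages"
    for file in fileset:
        # Load image, read with pytesseract
        path = os.path.join(download_location, file)
        img = cv2.imread(path, 1)
        # The result is the PDF as bytes; wrap it so the merger can read it
        result = pytesseract.image_to_pdf_or_hocr(img, lang="eng", config=tessdata_dir_config)
        merger.append(BytesIO(result))

    merger.write(fr".\output\{FILE}.pdf")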
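Why BytesIO helps: it wraps bytes that are already in memory in an object that behaves like an open binary file, so nothing has to be written to disk first. A tiny standard-library sketch, with placeholder bytes instead of a real PDF and a hypothetical helper name:

```python
from io import BytesIO


def as_file_object(data):
    """Wrap raw bytes in a seekable, file-like object without touching the disk."""
    return BytesIO(data)


if __name__ == "__main__":
    fake_pdf_bytes = b"%PDF-1.4 placeholder bytes, not a real PDF"
    stream = as_file_object(fake_pdf_bytes)
    print(stream.read(8))      # read the first bytes, as from a file
    stream.seek(0)             # rewind, as with a file
    print(len(stream.read()))  # read everything
```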

How to Merge two pages from a pdf file as one page

The PyPDF2 library also has a PdfFileMerger object that should do exactly what you want.

As in the example here, you can just create a PdfFileMerger, read two pages and put them into one single file.

I changed your script slightly so that it also creates files with pages 0-1, 2-3, 4-5, etc. (page 0 is the first page, since Python numbering starts from 0).

import os
from PyPDF2 import PdfFileReader, PdfFileWriter, PdfFileMerger

def pdf_splitter(path):

    fname = os.path.splitext(os.path.basename(path))[0]

    pdf = PdfFileReader(path)
    input_paths = []
    for page in range(pdf.getNumPages()):
        pdf_writer = PdfFileWriter()
        pdf_writer.addPage(pdf.getPage(page))
        output_filename = '{}_page_{}.pdf'.format(fname, page+1)
        input_paths.append(output_filename)
        with open(output_filename, 'wb') as out:
            pdf_writer.write(out)

        print('Created: {}'.format(output_filename))

        # every 2 pages!
        # Change the two if you need every other number of pages!
        if page % 2 == 1:
            pdf_merger = PdfFileMerger() # create the PdfFileMerger
            for single_page in input_paths:
                pdf_merger.append(single_page) # read the single pages

            # we call it pages_N-1_N, so first would be pages_0_1!
            output_path = '{}_pages_{}_{}.pdf'.format(fname, page-1, page)
            with open(output_path, 'wb') as fileobj:
                pdf_merger.write(fileobj) # write the two pages pdf!

            input_paths = []

if __name__ == '__main__':

    # raw string, so that \f is not read as a form-feed escape
    path = r'D:\Tasks\Samples\fw9.pdf'
    pdf_splitter(path)

Is this what you wanted?

This first creates a single PDF for each page and then combines them two by two. Creating the single-page PDFs could also be skipped, but I was not sure whether you wanted them or not.
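Note that the script above only merges complete pairs, so with an odd page count the last page ends up only in its single-page file. If you would rather decide the grouping up front, a small helper like the following can compute it. The name page_groups is just for illustration; it works on page indices only and does not touch any PDF.

```python
def page_groups(num_pages, group_size=2):
    """Split page indices 0..num_pages-1 into consecutive groups.

    The last group is shorter when num_pages is not a multiple of group_size.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [
        list(range(start, min(start + group_size, num_pages)))
        for start in range(0, num_pages, group_size)
    ]


if __name__ == "__main__":
    # Five pages, two per output file.
    for group in page_groups(5):
        print(group)
```

Each group can then be fed to one merger or writer, which also avoids creating the intermediate single-page files.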

Concatenating multiple page pdf into single page pdf

You can create a new page object that is twice as tall as the first (assuming both pages are of equal height) and put the pages one after the other in the new page.

from PyPDF2 import PdfFileReader, PdfFileWriter
from PyPDF2.pdf import PageObject

reader = PdfFileReader(open("file.pdf",'rb'))

page_1 = reader.getPage(0)
page_2 = reader.getPage(1)

# Creating a new blank page with double the height of the original
translated_page = PageObject.createBlankPage(None, page_1.mediaBox.getWidth(), page_1.mediaBox.getHeight()*2)

#Adding the pages to the new empty page
translated_page.mergeScaledTranslatedPage(page_1, 1, 0, page_1.mediaBox.getHeight())
translated_page.mergePage(page_2)

writer = PdfFileWriter()
writer.addPage(translated_page)

with open('out.pdf', 'wb') as f:
    writer.write(f)

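If they are of different heights, create the blank page with the combined height instead (and translate page_1 by page_2's height rather than its own, since page_2 sits at the bottom):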

translated_page = PageObject.createBlankPage(None, page_1.mediaBox.getWidth(), page_1.mediaBox.getHeight()+ page_2.mediaBox.getHeight())
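The same idea generalises to any number of pages. The sketch below only does the geometry, not the PDF work: given the page sizes, it returns the canvas size and the vertical offset at which each page would have to be placed. stack_layout is a hypothetical helper name, and sizes are plain numbers in whatever unit the pages use.

```python
def stack_layout(page_sizes):
    """Compute a vertical stacking layout.

    page_sizes: list of (width, height) in reading order (first page on top).
    Returns (canvas_width, canvas_height, y_offsets), where y_offsets[i] is the
    distance from the bottom of the canvas to the bottom of page i
    (PDF coordinates start at the bottom-left corner).
    """
    canvas_width = max((w for w, _ in page_sizes), default=0)
    canvas_height = sum(h for _, h in page_sizes)
    y_offsets = []
    remaining = canvas_height
    for _, height in page_sizes:
        remaining -= height
        y_offsets.append(remaining)
    return canvas_width, canvas_height, y_offsets


if __name__ == "__main__":
    # Two pages of different heights; the sizes are made-up example values.
    print(stack_layout([(595, 842), (595, 400)]))
```

Each offset is what you would pass as the vertical translation when merging the corresponding page onto the blank canvas.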

How to stitch two PDF pages together as one big page?

Since the OP didn't provide (a link to) the original input posters, this answer will proceed in three steps:

  1. Create 2 dummy posters as input for step 3
  2. Create a LaTeX document which embeds the 2 dummy posters
  3. Run pdflatex to create a PDF from the LaTeX document in step 2

Step 1: Create 2 dummy posters (with size of 36in x 48in)

I've created two different dummy posters as PDF to show you how to do it with LaTeX. (That implies: you need to have at least a basic LaTeX installation on your system, including the pdflatex utility.)

These two dummies I created with the help of Ghostscript. Since for Ghostscript's pdfwrite device 1in == 72pt == 720pixels, the commands are like this (because 36in == 2592pt == 25920pixels and 48in == 3456pt == 34560pixels):

gs -o poster1.pdf \
   -g25920x34560 \
   -sDEVICE=pdfwrite \
   -c " /Helvetica-Bold findfont" \
   -c " 500 scalefont" \
   -c " setfont" \
   -c " 50 2000 moveto" \
   -c " (POSTER 1) show" \
   -c " 1 0 0 setrgbcolor" \
   -c " 10 setlinewidth" \
   -c " 20 20 2552 3416 rectstroke" \
   -c " showpage"

gs -o poster2.pdf \
   -g25920x34560 \
   -sDEVICE=pdfwrite \
   -c " /Helvetica-Bold findfont" \
   -c " 600 scalefont" \
   -c " setfont" \
   -c " 50 2000 moveto" \
   -c " (Poster 2) show" \
   -c " 1 0 0 setrgbcolor" \
   -c " 10 setlinewidth" \
   -c " 20 20 2552 3416 rectstroke" \
   -c " showpage"
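The -g values in the commands above follow directly from the 720 pixels per inch stated earlier. For other poster sizes, the arithmetic can be sketched like this in Python; the helper names are made up for this example.

```python
POINTS_PER_INCH = 72
PDFWRITE_DPI = 720  # resolution used in the text above: 1in == 720 pixels


def gs_geometry(width_in, height_in, dpi=PDFWRITE_DPI):
    """Return the -g<width>x<height> argument for a page size given in inches."""
    return f"-g{round(width_in * dpi)}x{round(height_in * dpi)}"


def inches_to_points(value_in):
    return value_in * POINTS_PER_INCH


if __name__ == "__main__":
    # The 36in x 48in poster from this answer.
    print(gs_geometry(36, 48))
    print(inches_to_points(36), inches_to_points(48))
```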

Here are 2 screenshots showing these "posters":


Step 2: Create a little LaTeX program to be run with pdflatex

There is a LaTeX package called 'pdfpages' which can insert PDF pages into LaTeX documents, but which also can create "n-up" layouts from PDF pages. (In addition to your basic LaTeX installation you need that package too.)

So here is a small LaTeX program you can use. Save it as 2up-poster.tex:

\documentclass{article}
\usepackage{pdfpages}
\usepackage[paperwidth=72in, paperheight=48in]{geometry}
\pagestyle{empty} % Don't use page numbers

\begin{document}
\setlength\voffset{+0.0in} % adj. vert. offset as needed
\setlength\hoffset{+0.0in} % adj. horiz. offset as needed
\includepdfmerge[nup=2x1,
                 noautoscale=true, % set "false" if larger inputs
                 frame=false, % set "true" for frames
                 templatesize={36in}{48in}] % adjust as needed
                {poster1.pdf,poster2.pdf} % modify for file names
\end{document}

Step 3: Run pdflatex

Now you can run the following command to create your composed poster:

pdflatex 2up-poster.tex

This will create a PDF file named 2up-poster.pdf.
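If you need this for varying numbers of posters or other sizes, the .tex file can be generated instead of edited by hand. The following Python sketch writes out a document of the same shape as the one in step 2. The function name and the output file name are assumptions for this example; it uses \pagestyle{empty} to suppress page numbers and leaves out the offset settings.

```python
def nup_poster_tex(pdf_files, cols, rows, tile_w_in, tile_h_in):
    """Return LaTeX source that tiles pdf_files in a cols x rows grid.

    The paper size is the tile size multiplied by the grid dimensions.
    """
    paper_w = cols * tile_w_in
    paper_h = rows * tile_h_in
    lines = [
        r"\documentclass{article}",
        r"\usepackage{pdfpages}",
        r"\usepackage[paperwidth=%gin, paperheight=%gin]{geometry}" % (paper_w, paper_h),
        r"\pagestyle{empty}",
        r"\begin{document}",
        r"\includepdfmerge[nup=%dx%d,noautoscale=true,templatesize={%gin}{%gin}]"
        % (cols, rows, tile_w_in, tile_h_in),
        "{" + ",".join(pdf_files) + "}",
        r"\end{document}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two 36in x 48in posters side by side, as in this answer.
    source = nup_poster_tex(["poster1.pdf", "poster2.pdf"], 2, 1, 36, 48)
    with open("2up-poster-generated.tex", "w") as handle:
        handle.write(source)
    print(source)
```

The generated file is then compiled with pdflatex in the same way as the hand-written one.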

The result is here as a screenshot:

Merge PDF files

Use Pypdf or its successor PyPDF2:

A Pure-Python library built as a PDF toolkit. It is capable of:

  • splitting documents page by page,
  • merging documents page by page,

(and much more)

Here's a sample program that works with both versions.

#!/usr/bin/env python
import sys
try:
    from PyPDF2 import PdfFileReader, PdfFileWriter
except ImportError:
    from pyPdf import PdfFileReader, PdfFileWriter

def pdf_cat(input_files, output_stream):
    input_streams = []
    try:
        # First open all the files, then produce the output file, and
        # finally close the input files. This is necessary because
        # the data isn't read from the input files until the write
        # operation. Thanks to
        # https://stackoverflow.com/questions/6773631/problem-with-closing-python-pypdf-writing-getting-a-valueerror-i-o-operation/6773733#6773733
        for input_file in input_files:
            input_streams.append(open(input_file, 'rb'))
        writer = PdfFileWriter()
        for reader in map(PdfFileReader, input_streams):
            for n in range(reader.getNumPages()):
                writer.addPage(reader.getPage(n))
        writer.write(output_stream)
    finally:
        for f in input_streams:
            f.close()
        output_stream.close()

if __name__ == '__main__':
    if sys.platform == "win32":
        import os, msvcrt
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    # On Python 3, binary data must be written to sys.stdout.buffer
    pdf_cat(sys.argv[1:], getattr(sys.stdout, 'buffer', sys.stdout))
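The "open everything first, close everything last" pattern in pdf_cat can also be written with contextlib.ExitStack from the standard library, which closes all the files even if something fails halfway through. This is a generic sketch, not tied to any PDF library: with_all_open and the consume callback are hypothetical names, and the demo just concatenates two small temporary files.

```python
import os
import tempfile
from contextlib import ExitStack


def with_all_open(paths, consume):
    """Open every path in binary mode, call consume(streams), then close them all.

    The streams stay open for the whole call to consume, which is what a
    lazily reading writer needs.
    """
    with ExitStack() as stack:
        streams = [stack.enter_context(open(p, "rb")) for p in paths]
        return consume(streams)


if __name__ == "__main__":
    # Demo with two small temporary files instead of PDFs.
    folder = tempfile.mkdtemp()
    paths = []
    for name, data in [("a.bin", b"first "), ("b.bin", b"second")]:
        path = os.path.join(folder, name)
        with open(path, "wb") as handle:
            handle.write(data)
        paths.append(path)
    print(with_all_open(paths, lambda streams: b"".join(s.read() for s in streams)))
```

In pdf_cat, the consume step would be the part that builds the writer and calls writer.write(output_stream).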

