How to Create PDF Files in Python

How to create PDF files in Python

I suggest pyPdf. It works really nice. I also wrote a blog post some while ago, you can find it here.

Create PDF from a list of images

Install FPDF for Python:

pip install fpdf

Now you can use the same logic:

from fpdf import FPDF
pdf = FPDF()
# imagelist is the list with all image filenames
for image in imagelist:
pdf.add_page()
pdf.image(image,x,y,w,h)
pdf.output("yourfile.pdf", "F")

You can find more info at the tutorial page or the official documentation.

How do I create a simple pdf file in python?

You may use wkhtmltopdf. It is a command line utility that uses Webkit to convert html to pdf.

You can generate your data as html and style it with css if you want, then, use wkhtmltopdf to generate the pdf file.

Merge PDF files

Use Pypdf or its successor PyPDF2:

A Pure-Python library built as a PDF toolkit. It is capable of:

  • splitting documents page by page,
  • merging documents page by page,

(and much more)

Here's a sample program that works with both versions.

#!/usr/bin/env python
import sys
try:
from PyPDF2 import PdfFileReader, PdfFileWriter
except ImportError:
from pyPdf import PdfFileReader, PdfFileWriter

def pdf_cat(input_files, output_stream):
input_streams = []
try:
# First open all the files, then produce the output file, and
# finally close the input files. This is necessary because
# the data isn't read from the input files until the write
# operation. Thanks to
# https://stackoverflow.com/questions/6773631/problem-with-closing-python-pypdf-writing-getting-a-valueerror-i-o-operation/6773733#6773733
for input_file in input_files:
input_streams.append(open(input_file, 'rb'))
writer = PdfFileWriter()
for reader in map(PdfFileReader, input_streams):
for n in range(reader.getNumPages()):
writer.addPage(reader.getPage(n))
writer.write(output_stream)
finally:
for f in input_streams:
f.close()
output_stream.close()

if __name__ == '__main__':
if sys.platform == "win32":
import os, msvcrt
msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
pdf_cat(sys.argv[1:], sys.stdout)

Creating and writing to a pdf file in Python

you could install the fpdf library and then:

from fpdf import FPDF

pdf = FPDF()
pdf.add_page()
pdf.set_xy(0, 0)
pdf.set_font('arial', 'B', 13.0)
pdf.cell(ln=0, h=5.0, align='L', w=0, txt="Hello", border=0)
pdf.output('test.pdf', 'F')

How can we generate pdf file in python using dataframe

If you're willing to use LaTeX, you can accomplish this pretty easily with the tabulate library.

Here's some code I whipped up that will print out TeX that can be piped into something like pdflatex:

import tabulate    

headers=['id', 'name', 'medicine', 'qty', 'price']
items=[[1, 'Bob', 'advil', 4, 9.99],
[2, 'Mary', 'tylenol', 10, 5.99],
[3, 'Malcolm', 'vitamin_d', 1, 13.99]]
report_name='Report Name'
your_name='Foo Bar'

document = f'''
\\documentclass[10pt]{{article}}
\\title{{{report_name}}}
\\author{{{your_name}}}
\\begin{{document}}
\\maketitle
\\begin{{center}}
{tabulate.tabulate(items, headers, tablefmt='latex')}
\\end{{center}}
\\end{{document}}
'''

print(document)

Output:

\documentclass[10pt]{article}
\title{Report Name}
\author{Foo Bar}
\begin{document}
\maketitle
\begin{center}
\begin{tabular}{rllrr}
\hline
id & name & medicine & qty & price \\
\hline
1 & Bob & advil & 4 & 9.99 \\
2 & Mary & tylenol & 10 & 5.99 \\
3 & Malcolm & vitamin\_d & 1 & 13.99 \\
\hline
\end{tabular}
\end{center}
\end{document}

I was able to turn this into a file "report.pdf" with one command (I'm on Linux so if you're on Windows this might be a bit different)

$ python3 report.py | pdflatex -jobname report

The "-jobname" flag on pdflatex will have the output file match the value of the flag (here, "report" generates "report.pdf")

This is what the output was for me, but LaTeX is very customizable and there's a lot that can be done to make this look a bit nicer.

See: https://www.overleaf.com/learn/latex/Tables

The TeX-formatted report

Most pythonic way to create PDF Files from JSON with Styling?

I don't know about the most Pythonic way, but I would do it like so:

  • Figure out what language you want to define the final output in. Since it has a lot of complex formatting, I'd say you want HTML (probably with CSS) or Latex.
  • Write a Jinja template in this target language, with variables in the appropriate places.
  • Plug the values from your JSON into Jinja to render the template and construct the HTML/Latex of every question.
  • Use pandoc to convert the HTML to PDF.

While this is quite a few technologies, they are all well suited to their task and easier to work with. The problem here is that you want to build PDFs with very specific layout. However PDFs are very complex and not all libraries implement it well - but pandoc does.



Related Topics



Leave a reply



Submit