Bash-script printing a pdf to a pdf in Linux
You could try putting your PDF files through Ghostscript. I have found that this is enough to fix many problematic PDFs.
gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf input.pdf
(The same command can also be used to merge several PDF files into one, just specify multiple input files.)
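To run the same rewrite over several files, the command can be wrapped in a small function. This is only a minimal sketch, assuming gs is on the PATH; the name repair_pdf and the fixed- output prefix are hypothetical choices for illustration, not part of Ghostscript.

```shell
#!/bin/bash
# repair_pdf (hypothetical helper): rewrite each given PDF through Ghostscript
# and store the result next to the original as fixed-<name>.
repair_pdf() {
    local src dir base
    for src in "$@"; do
        dir=$(dirname -- "$src")
        base=$(basename -- "$src")
        # Same flags as the one-off command above; stop at the first failure.
        gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
           -sOutputFile="$dir/fixed-$base" "$src" || return 1
    done
}
```

It could then be called as repair_pdf *.pdf; the originals are left untouched.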
Output list of pdf files as one pdf using pdftk bash script
Something like this:
#!/bin/bash
files=()
add() {
files+=("'""$1""'")
}
add "file1.pdf"
#add "file2.pdf"
add "file3.pdf"
add "file with spaces.pdf"
echo "${files[*]}"
Naturally, substitute the proper pdftk command for echo.
Edit 2
This new "version" will work better with filenames containing spaces.
Edit 3
To hand the files over to the command, it seems something like the following will do the trick:
bash -c "stat $(echo "${files[*]}")"
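An alternative sketch that avoids storing quote characters in the array: keep the bare file names and expand the array as "${files[@]}", which hands each element to the command as exactly one argument, so no extra bash -c is needed. The names merge_pdfs and merged.pdf are hypothetical; the pdftk call uses the usual cat output form.

```shell
#!/bin/bash
files=()
add() {
    files+=("$1")   # store the name as-is; no literal quotes needed
}
add "file1.pdf"
#add "file2.pdf"
add "file3.pdf"
add "file with spaces.pdf"

# merge_pdfs (hypothetical helper): concatenate the collected files into $1.
merge_pdfs() {
    pdftk "${files[@]}" cat output "$1"
}
# Example call (left commented out so the script has no side effects):
# merge_pdfs merged.pdf
```

Because the expansion is quoted, a name such as "file with spaces.pdf" reaches pdftk as a single argument.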
Print contents of a PDF to the command line
On the man pages for pdftotext, I found this:
pdftotext [options] [PDF-file [text-file]]
Description
Pdftotext converts Portable Document Format (PDF) files to plain text. Pdftotext reads the PDF file, PDF-file, and writes a text file, text-file. If text-file is not specified, pdftotext converts file.pdf to file.txt. If text-file is '-', the text is sent to stdout.
Thus, to write to stdout in order to pipe to grep, use this:
pdftotext mydoc.pdf - | grep mysearchterm
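The same pipe can be applied to a whole set of files. A minimal sketch, assuming pdftotext is installed; grep_pdfs is a hypothetical helper name. It prints the name of every PDF whose extracted text matches the pattern.

```shell
#!/bin/bash
# grep_pdfs (hypothetical helper): grep_pdfs PATTERN FILE...
# Prints each PDF whose text contains PATTERN.
grep_pdfs() {
    local pattern=$1 f
    shift
    for f in "$@"; do
        # '-' sends the extracted text to stdout; grep -q only reports a match.
        if pdftotext "$f" - 2>/dev/null | grep -q -- "$pattern"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example: grep_pdfs mysearchterm *.pdf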
How to print out pdf file with script generated highlighted output?
First you have to convert the colored shell output to HTML, then to PDF.
Use the ansi2html.sh script for the first step; you can try something like this:
cat myapp_log | ansi2html.sh -p > myapp_log.html
html2any myapp_log.html file.pdf
Find string inside pdf with shell
As Simon nicely pointed out, you can simply convert the PDF to plain text using pdftotext and then search for what you're looking for.
After conversion, you may use grep, bash regex, or any variation you want:
# IFS= and -r keep leading whitespace and backslashes in each line intact.
while IFS= read -r line; do
    if [[ ${line} =~ [0-9]{4}(-[0-9]{2}){2} ]]; then
        echo ">>> Found date;"
    fi
done < <(pdftotext infile.pdf -)
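The same check can be written with grep alone. A minimal sketch, assuming a grep that supports -E and -o (GNU and BSD grep both do); find_dates is a hypothetical name. It reads text on standard input and prints every matching date on its own line instead of a fixed message.

```shell
#!/bin/bash
# find_dates (hypothetical helper): print each YYYY-MM-DD style match from stdin.
find_dates() {
    grep -Eo '[0-9]{4}(-[0-9]{2}){2}'
}
# Example call:
# pdftotext infile.pdf - | find_dates
```

Like the loop above, the pattern only checks the digit layout, so a string such as 9999-99-99 would also match.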
How to write shell script for finding number of pages in PDF?
Without any extra package:
strings < file.pdf | sed -n 's|.*/Count -\{0,1\}\([0-9]\{1,\}\).*|\1|p' \
| sort -rn | head -n 1
Using pdfinfo:
pdfinfo file.pdf | awk '/^Pages:/ {print $2}'
Using pdftk:
pdftk file.pdf dump_data | grep NumberOfPages | awk '{print $2}'
You can also recursively sum the total number of pages in all PDFs via pdfinfo as follows:
find . -xdev -type f -name "*.pdf" -exec pdfinfo "{}" ";" | \
awk '/^Pages:/ {n += $2} END {print n}'
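If a per-file breakdown is wanted as well as the total, the pdfinfo variant can be put in a loop. A minimal sketch, assuming pdfinfo is installed; page_report is a hypothetical helper name. Files for which pdfinfo reports no page count are skipped with a message on stderr.

```shell
#!/bin/bash
# page_report (hypothetical helper): print "<pages> <file>" for each PDF,
# followed by a final "<sum> total" line.
page_report() {
    local f n total=0
    for f in "$@"; do
        n=$(pdfinfo "$f" 2>/dev/null | awk '/^Pages:/ {print $2}')
        # Skip anything that did not yield a plain number.
        [[ $n =~ ^[0-9]+$ ]] || { printf 'skipped %s\n' "$f" >&2; continue; }
        printf '%s %s\n' "$n" "$f"
        total=$((total + n))
    done
    printf '%s total\n' "$total"
}
```

For example: page_report *.pdf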
Merge / convert multiple PDF files into one PDF
I'm sorry, I managed to find the answer myself using Google and a bit of luck : )
For those interested:
I installed pdftk (the PDF toolkit) on our Debian server, and the following command produced the desired output:
pdftk file1.pdf file2.pdf cat output output.pdf
OR
gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf file1.pdf file2.pdf file3.pdf ...
This in turn can be piped directly into pdf2ps.
Linux piping (convert - pdf2ps - lp)
convert file1.pdf file2.pdf - | pdf2ps - - | lp -s
should do the job.
You send the output of the convert command to pdf2ps, which in turn feeds its output to lp.