pg_dump Not Being Killed Inside a Script with the kill Command

Track the pg_dump process ID and report an error if it fails.

Use wait <pid> to retrieve the exit code of a background process.

See the Bash man page entry for wait:

wait [-fn] [id ...]

Wait for each specified child process and return its termination status. Each id may be a process ID or a job specification; if a job spec is given, all processes in that job's pipeline are waited for.

If id is not given, all currently active child processes are waited for, and the return status is zero.

If the -n option is supplied, wait waits for any job to terminate and returns its exit status.

If the -f option is supplied, and job control is enabled, wait forces id to terminate before returning its status, instead of returning when it changes status.

If id specifies a non-existent process or job, the return status is 127. Otherwise, the return status is the exit status of the last process or job waited for.
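As a minimal illustration of the behavior described above, here is a sketch that uses false as a stand-in for a failing background command:

```shell
#!/usr/bin/env bash
false &          # stand-in for a failing background job; exits with status 1
bg_pid=$!        # PID of the background job

wait "$bg_pid"   # blocks until the job finishes
rc=$?            # wait returns the job's exit status
echo "background job exited with $rc"   # prints: background job exited with 1
```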

Let's implement the background wait: print dots while pg_dump runs, then check its return code:

#!/usr/bin/env bash

function term_bg_jobs() {
    # Registered as a SIGINT handler to make sure that
    # we don't leave running background jobs behind
    # when this script is interrupted with Ctrl+C
    local -a job_ids
    read -r -d '' -a job_ids < <(jobs -p)
    [ "${#job_ids[@]}" -gt 0 ] && kill "${job_ids[@]}" 2>/dev/null
}

function dumping() {
    local -i pg_dump_pid pg_dump_rc dot_pid

    # Start pg_dump as a background task
    pg_dump --host=xxx --dbname=xxx --port=xxx --username=xxx -C --file=xxx.sql --table=xxx &
    pg_dump_pid=$!  # PID of the background pg_dump

    printf $"%s in progress, please wait:\n" 'pg_dump'
    while :; do     # infinite loop
        printf '.'  # print a dot
        sleep 1     # every second
    done &          # run in the background
    dot_pid=$!      # PID of the background dot printer

    wait -f "$pg_dump_pid"  # wait for the dump to finish
    pg_dump_rc=$?           # save the dump's exit code
    kill "$dot_pid"         # stop the background dot printer
    echo                    # newline after the dots

    if [[ $pg_dump_rc -gt 0 && $pg_dump_rc -lt 127 ]]; then
        printf $"%s ended with error code %d\n" 'pg_dump' "$pg_dump_rc"
    fi
}

# Install our Ctrl+C trap handler to terminate background jobs
trap term_bg_jobs INT

dumping
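You can check the pattern without a real database by swapping pg_dump for a stand-in. A sketch, assuming a subshell that sleeps briefly and then fails with code 3 plays the role of a failing dump:

```shell
#!/usr/bin/env bash
# Stand-in for pg_dump: runs for two seconds, then fails with code 3.
( sleep 2; exit 3 ) &
work_pid=$!

# Dot printer in the background, as in dumping() above.
while :; do printf '.'; sleep 1; done &
dot_pid=$!

wait "$work_pid"        # blocks until the stand-in finishes
work_rc=$?              # its exit code (3 here)
kill "$dot_pid" 2>/dev/null
echo
echo "stand-in exited with $work_rc"
```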

Taking tables to be dumped as options in a Bash script that dumps a PostgreSQL database

Since you don't have any other non-optional arguments, let those be the tables.

tables=()
for table in "$@"; do
    tables+=(-t "$table")
done

pg_dump "${tables[@]}"
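For example, with hypothetical table names, calling the script as ./dump_tables.sh users orders builds the argument list -t users -t orders. The expansion can be checked without a database:

```shell
#!/usr/bin/env bash
set -- users orders   # simulate the script's positional arguments

tables=()
for table in "$@"; do
    tables+=(-t "$table")
done

# Show what would be passed to pg_dump:
echo "pg_dump ${tables[*]}"   # prints: pg_dump -t users -t orders
```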

django-dbbackup: specify which pg_dump to use

If you have made a non-root installation of PostgreSQL, say with the user nonrootuser, then you should find psql as well as pg_dump for this installation under /home/nonrootuser/postgres/bin/. This is the pg_dump you want to use.

Dbbackup allows you to specify the connector it uses to create the backup.
In particular, it allows you to specify the dump command (DUMP_CMD).
To specify the connector, add the following block to your settings.py:

import os  # if not yet imported

DBBACKUP_CONNECTORS = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'xxx',
        'USER': 'xxx',
        'PASSWORD': 'xxx',
        'HOST': 'xxx',
        'PORT': 'xxx',
        'DUMP_CMD': os.path.join(
            os.environ["HOME"],
            'postgres',
            'bin',
            'pg_dump'
        )
    }
}

Replace the xxx with your specific values.
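Before wiring the path into DUMP_CMD, it is worth verifying it from a shell. A sketch, assuming the non-root install lives under $HOME/postgres (hypothetical; adjust to your actual prefix):

```shell
#!/usr/bin/env bash
PG_DUMP="$HOME/postgres/bin/pg_dump"   # hypothetical non-root install path

if [ -x "$PG_DUMP" ]; then
    msg=$("$PG_DUMP" --version)        # e.g. "pg_dump (PostgreSQL) 15.x"
else
    msg="no executable pg_dump at $PG_DUMP"
fi
echo "$msg"
```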

Hope that helps!

What are the security pros/cons of a remote PostgreSQL database dump using SSH tunneling or using a local pg_dump client?

There are two things that "ssh tunnelling" can mean here. One of them is OK; the other is horrid.

You can tunnel TCP-in-TCP, running pg_dump on your local machine, connecting to a port on localhost that's forwarded to the remote PostgreSQL.

ssh -N -L 5433:localhost:5432 remote-box
pg_dump -h localhost -p 5433 -Fc -f dump-file.pgdump

That's what you seem to mean. Don't do this: TCP-over-TCP tunnelling performs poorly, and there's no good reason to use it here.


You can use ssh to stream pg_dump output from a pg_dump invoked on the remote machine, e.g.

ssh remote-box pg_dump -Fc -f - > local-dump-file.pgdump

This performs better and avoids issues with dangling tunnels.


You can also make a direct libpq connection with pg_dump. If you have SSL set up on the remote instance this is the best option.

pg_dump "host=remote-box sslmode=require user=blah" -Fc -f dumpfile.pgdump

... but if you don't have SSL, you probably should run pg_dump over ssh and stream the results.


