How to Prevent Tar from Overwriting an Existing Archive

How do I prevent tar from overwriting an existing archive?

I created the file ~/scripts/tar.sh:

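#!/bin/bash
# Usage: tar.sh ARCHIVE PATH...
# Refuses to create ARCHIVE if a file with that name already exists.

archive=$1
if [ -f "$archive" ]; then
    echo "Oops! backup file was already here." >&2
    exit 1
fi
shift
# Everything after the archive name is a file or directory to back up.
tar -cpvzf "$archive" "$@"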

Now I just have to type:

~/scripts/tar.sh ~/Backup/backup_file_name_`date +"%Y-%m-%d"`_a.tar.gz directory_to_backup/

The backup file is created only if a file with that name doesn't already exist.
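For comparison, here is a minimal Python sketch of the same idea, not a replacement for the script above. It relies on the exclusive-creation mode "x:gz" of the standard tarfile module, so the existence check and the creation happen in a single open call rather than as two separate steps. The function name create_archive_exclusive and its parameters are made up for this illustration.

```python
import tarfile
from pathlib import Path


def create_archive_exclusive(archive_path, sources):
    """Create a gzip-compressed tar archive without overwriting an existing file.

    Returns True if the archive was created, False if archive_path already existed.
    (Hypothetical helper, named for this example only.)
    """
    try:
        # "x:gz" requests exclusive creation: the open fails if the file exists.
        with tarfile.open(archive_path, "x:gz") as archive:
            for source in sources:
                # Store each source under its own base name.
                archive.add(source, arcname=Path(source).name)
    except FileExistsError:
        return False
    return True
```

Calling it twice with the same archive name should create the archive the first time and leave it untouched the second time.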

What does 'tar --overwrite' actually do (or not do)?

There are a few subtleties, but in general, here's the difference:

By default, when extracting, "tar" tries to create each output file with the flags O_CREAT | O_EXCL. If the file already exists, that open fails, and "tar" retries by deleting the existing file and then opening again with the same flags (i.e., creating a new file).

In contrast, with the --overwrite option, "tar" tries to open output files with the flags O_CREAT | O_TRUNC. If the file exists, it will be truncated to zero size and overwritten.

The main implication is that "tar" by default will delete and re-create existing files, so they'll get new inode numbers. With --overwrite, the inode numbers won't change:

$ ls -li foo
total 0
5360222 -rw-rw-r-- 1 buhr buhr 0 Jun 26 15:16 bar
$ tar -cf foo.tar foo
$ tar -xf foo.tar # inode will change
$ ls -li foo
total 0
5360224 -rw-rw-r-- 1 buhr buhr 0 Jun 26 15:16 bar
$ tar --overwrite -xf foo.tar # inode won't change
$ ls -li foo
total 0
5360224 -rw-rw-r-- 1 buhr buhr 0 Jun 26 15:16 bar
$

This also means that, for each file overwritten, "tar" by default will need three syscalls (open, unlink, open) while --overwrite will need only one (open with truncation).
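The two strategies described above can be imitated in a few lines of Python. This is only a sketch of the open/unlink/open and open-with-truncation sequences, not tar's actual code, and the function names replace_by_unlink and replace_by_truncate are invented for the example.

```python
import os


def replace_by_unlink(path, data):
    """Imitate the default: exclusive create, unlink on conflict, then retry."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    try:
        fd = os.open(path, flags, 0o644)
    except FileExistsError:
        # The name is taken: remove it and create a brand-new file.
        os.unlink(path)
        fd = os.open(path, flags, 0o644)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)


def replace_by_truncate(path, data):
    """Imitate --overwrite: reuse the existing file and truncate it in place."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
```

A practical way to see the difference is a hard link to the file. After replace_by_truncate the link shows the new contents, because both names still refer to the same inode. After replace_by_unlink the link keeps the old contents, because the original name now refers to a newly created file.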

How to avoid overwriting files

In Python, you can use tarGzipFile.getmembers() (where tarGzipFile is an open tarfile.TarFile object) to list the members of the archive, then pass members= to extractall() with only the ones you wish to extract (i.e., excluding files that already exist). os.path.exists() can be used to check whether a file exists.
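A minimal sketch of that approach, assuming the archive comes from a trusted source and that "already existing" means a path with the same name exists under the destination directory. The function name extract_missing_only is made up for this example.

```python
import os
import tarfile


def extract_missing_only(archive_path, dest_dir):
    """Extract only the members whose target path does not exist under dest_dir.

    Returns the names of the members that were extracted.
    (Hypothetical helper, named for this example only.)
    """
    with tarfile.open(archive_path) as archive:
        wanted = [
            member
            for member in archive.getmembers()
            # Skip anything that is already present in the destination.
            if not os.path.exists(os.path.join(dest_dir, member.name))
        ]
        archive.extractall(path=dest_dir, members=wanted)
    return [member.name for member in wanted]
```

Files that are already present keep their current contents, and only the missing ones are written.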

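When adding files to a tar archive, tar overwrites the archive halfway through the command

With -exec ... {} +, find may run the command more than once if the arguments are too long for a single command line, and every tar -c run recreates the archive from scratch. Create an empty archive first and use tar -r (append) to add to that archive instead of creating it. Note that appending only works on an uncompressed archive, so compress it afterwards if needed.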
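The effect is easy to reproduce outside of find. The sketch below uses Python's tarfile module to stand in for repeated tar invocations: mode "w" behaves like tar -c and mode "a" like tar -r. The function name add_in_batches and the batch lists are assumptions made for this illustration.

```python
import os
import tarfile


def add_in_batches(archive_path, batches, mode):
    """Open the archive once per batch, the way repeated tar runs would.

    mode "w" recreates the archive on every batch (like tar -c).
    mode "a" appends to it (like tar -r); it only works uncompressed.
    Returns the member names found in the finished archive.
    """
    for batch in batches:
        with tarfile.open(archive_path, mode) as archive:
            for path in batch:
                archive.add(path, arcname=os.path.basename(path))
    with tarfile.open(archive_path) as archive:
        return archive.getnames()
```

With two batches, mode "w" leaves only the files from the last batch in the archive, while mode "a" keeps the files from both.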


