How to uncompress a tar.gz in another directory
gzip -dc archive.tar.gz | tar -xf - -C /destination
or, with GNU tar
tar xzf archive.tar.gz -C /destination
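One thing that is easy to trip over with either command: tar does not create the -C target for you, so the destination has to exist first. A minimal sketch, assuming GNU tar and a scratch directory from mktemp (the project/readme.txt names are made up for illustration):

```shell
set -eu
work=$(mktemp -d)                      # scratch area; all names below are hypothetical
mkdir -p "$work/src/project"
echo "hello" > "$work/src/project/readme.txt"

# Build a small archive whose members start with "project/".
tar -czf "$work/archive.tar.gz" -C "$work/src" project

# tar -C does not create the destination, so create it first.
mkdir -p "$work/destination"
tar -xzf "$work/archive.tar.gz" -C "$work/destination"

# Show what was extracted; remove "$work" when you are done.
ls -R "$work/destination"
```

The same mkdir -p step applies to the gzip -dc pipeline form.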
Extract a specific folder to specific directory from a tar.gz
Ok I figured it out!
Basically, I can just use the --strip-components option to remove a given number of leading directories. In this case, my command looks like this:
tar -xzf backup.tar.gz --strip-components=3 -C a/b/m
That removed the first three leading directories from the paths in my archive (backup.tar.gz: a/b/c/d) before extracting into the destination directory.
Now it looks like this: a/b/m/d
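To see the effect on a throwaway archive, here is a small sketch. It assumes GNU tar and rebuilds a backup.tar.gz with the same a/b/c/d layout; file.txt is a made-up file so there is something to extract:

```shell
set -eu
work=$(mktemp -d)                      # scratch area
cd "$work"

# Recreate an archive whose members live under a/b/c/d.
mkdir -p src/a/b/c/d
echo "payload" > src/a/b/c/d/file.txt  # hypothetical file
tar -czf backup.tar.gz -C src a/b/c/d

# Drop the first three path components (a/b/c) and extract into a/b/m.
mkdir -p a/b/m
tar -xzf backup.tar.gz --strip-components=3 -C a/b/m

find a -type f                         # prints a/b/m/d/file.txt
```

Members with fewer path components than the number given are skipped, so it is worth listing the archive with tar -tzf first.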
Extracting specific folders in multiple tar.gz files recursively
Possible solution
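After tinkering with the shell code above, I managed to extract only the csv folders by adding a csv wildcard pattern. The -C option goes before the pattern because GNU tar applies it only to the member names that follow it; GNU tar also needs --wildcards before it treats the pattern as a wildcard when extracting:
for f in *.tar.gz; do tar -xzvf "$f" -C ../synthea_output "*csv*"; done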
The output now looks like this:
|-- output_1
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_10
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_11
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_12
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_2
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_3
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_4
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_5
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_6
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_7
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_8
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
`-- output_9
    `-- csv
        |-- allergies.csv
        |-- careplans.csv
        |-- conditions.csv
        |-- encounters.csv
        |-- immunizations.csv
        |-- medications.csv
        |-- observations.csv
        |-- patients.csv
        `-- procedures.csv
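For anyone who wants to try the loop without the real data, here is a self-contained sketch. It assumes GNU tar, where --wildcards is needed for patterns on extraction and -C has to precede the pattern. The two tiny archives and the fhir folder are invented stand-ins, not the actual Synthea output:

```shell
set -eu
work=$(mktemp -d)                      # scratch area
mkdir -p "$work/archives" "$work/synthea_output"

# Build two small stand-in archives, each with a csv and a fhir folder.
for n in 1 2; do
    mkdir -p "$work/src/output_$n/csv" "$work/src/output_$n/fhir"
    echo "id" > "$work/src/output_$n/csv/patients.csv"
    echo "{}" > "$work/src/output_$n/fhir/bundle.json"
    tar -czf "$work/archives/output_$n.tar.gz" -C "$work/src" "output_$n"
done

# Extract only the members under a csv folder from every archive.
cd "$work/archives"
for f in *.tar.gz; do
    tar -xzf "$f" -C ../synthea_output --wildcards '*/csv/*'
done

find ../synthea_output -type f | sort
```

The pattern is single-quoted so the shell passes it to tar unexpanded.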
Extract tar archive excluding a specific folder and its contents
You can use '--exclude' to omit a folder:
tar -xf archive.tar -C "/home/user/target/folder" --exclude="folderC"
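A quick way to check this on a throwaway archive, sketched under the assumption of GNU tar. folderA and the two text files are made-up names, and a scratch directory stands in for /home/user/target/folder:

```shell
set -eu
work=$(mktemp -d)                      # scratch area
mkdir -p "$work/src/folderA" "$work/src/folderC" "$work/target"
echo "keep" > "$work/src/folderA/a.txt"
echo "skip" > "$work/src/folderC/c.txt"
tar -cf "$work/archive.tar" -C "$work/src" folderA folderC

# Extract everything except folderC and its contents.
tar -xf "$work/archive.tar" -C "$work/target" --exclude="folderC"

ls "$work/target"                      # prints folderA
```

Because the exclude pattern is not anchored by default, it would also skip a folderC nested deeper in the archive.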
How to extract a single file from tar to a different directory?
The problem is that your arguments are in the wrong order. The single file argument must come last.
E.g.
$ tar xvf test.tar -C anotherDirectory/ testfile1
should do the trick.
PS: You should have asked this question on superuser instead of SO
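Here is the same command on a scratch archive, as a rough sketch assuming GNU tar; testfile2 is an extra made-up member to show that only the named file is extracted:

```shell
set -eu
work=$(mktemp -d)                      # scratch area
cd "$work"
echo "one" > testfile1
echo "two" > testfile2                 # hypothetical second member
tar -cf test.tar testfile1 testfile2

# -C comes first, the member to extract comes last.
mkdir anotherDirectory
tar -xvf test.tar -C anotherDirectory/ testfile1

ls anotherDirectory                    # prints testfile1
```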
How do I extract only the desired files from tar.gz?
I tested it with the following folder structure:
data/
data/a
data/a/ANOTHER_SNAPSHOT.jar
data/b
data/c
data/c/SNAPSHOT.jar
data/d
data/e
data/f
data/f/SNAPSHOT.jar.with.extension
data/g
data/g/SNAPSHOT.jar
data/h
The following wildcard pattern works and extracts only the files named exactly SNAPSHOT.jar, not SNAPSHOT.jar.with.extension or ANOTHER_SNAPSHOT.jar:
tar -xf data.tar.gz --wildcards "*/SNAPSHOT.jar"
Result:
data/c/SNAPSHOT.jar
data/g/SNAPSHOT.jar
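The test can be reproduced roughly like this, assuming GNU tar. It rebuilds only the folders that contain files, and the out directory is an addition of mine so the extracted copies do not land on top of the originals:

```shell
set -eu
work=$(mktemp -d)                      # scratch area
cd "$work"

# Recreate the files from the listing above (empty files are enough).
mkdir -p data/a data/c data/f data/g
touch data/a/ANOTHER_SNAPSHOT.jar data/c/SNAPSHOT.jar \
      data/f/SNAPSHOT.jar.with.extension data/g/SNAPSHOT.jar
tar -czf data.tar.gz data

# Extract only members named exactly SNAPSHOT.jar, in any folder.
mkdir out                              # hypothetical target directory
tar -xzf data.tar.gz -C out --wildcards '*/SNAPSHOT.jar'

(cd out && find . -type f | sort)
```

The leading */ is what rules out ANOTHER_SNAPSHOT.jar, since the name has to start right after a slash.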
Extract files contained in archive.tar.gz to new directory named archive
Update: since GNU tar 1.28, use --one-top-level; see https://www.gnu.org/software/tar/manual/tar.html#index-one_002dtop_002dlevel_002c-summary
Older versions need a script for this. You can specify the directory the archive is extracted into with tar's -C option.
The script below assumes that the directories do not exist and must be created. If the directories do exist, the script still works: mkdir --parents leaves an existing directory untouched.
tar -xvzf archive.tar.gz -C archive_dir
e.g.
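for a in *.tar.gz
do
    # Strip the .tar.gz suffix to get the target directory name.
    a_dir=${a%.tar.gz}
    mkdir --parents "$a_dir"
    # Quote the variables so names with spaces survive.
    tar -xvzf "$a" -C "$a_dir"
done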
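For the --one-top-level route, a small sketch of what to expect. It assumes GNU tar 1.28 or newer; a.txt and b.txt are made-up members stored without any top-level folder:

```shell
set -eu
work=$(mktemp -d)                      # scratch area
cd "$work"

# An archive whose members have no common top-level directory.
echo "a" > a.txt
echo "b" > b.txt
tar -czf archive.tar.gz a.txt b.txt
rm a.txt b.txt

# Without an argument, the new directory is named after the archive
# with the .tar.gz suffix removed, here "archive".
tar -xzf archive.tar.gz --one-top-level

ls archive
```

A different directory name can be given as --one-top-level=DIR.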
How to extract a number of tar.gz files to a directory?
import glob, os, re, tarfile

# Setup main paths.
tarfile_rootdir = r'D:\SPRING2019\Tarfiles'
extract_rootdir = r'D:\SPRING2019\Test'

# Process the files.
re_pattern = re.compile(r'\A(\w+)-\d+[a-zA-Z]0{0,5}(\d+)')

for tar_file in glob.iglob(os.path.join(tarfile_rootdir, '*.tgz')):
    # Get the parts from the base tgz filename using regular expressions.
    part = re.findall(re_pattern, os.path.basename(tar_file))[0]

    # Build the extraction path from each part.
    extract_path = os.path.join(extract_rootdir, *part)

    # Extract all files from the tar archive.
    with tarfile.open(tar_file, 'r:gz') as r:
        r.extractall(extract_path)
This code is similar to the answer to your last question. Because the directory structure is uncertain, I will use an example structure.
TGZ files in D:\SPRING2019\Tarfiles:
DZB1216-500058L002001.tgz
DZB1216-500058L003001.tgz
Extract directory structure in D:\SPRING2019\Test:
DZB1216
    2001
    3001
The .tgz file paths are retrieved with glob.
From the example filename DZB1216-500058L002001.tgz, the regular expression captures 2 groups:
\A is an anchor at the start of the string. This is not a group.
(\w+) matches DZB1216. This is the 1st group.
-\d+[a-zA-Z]0{0,5} matches up to the next group. This is not a group.
(\d+) matches 2001. This is the 2nd group.
The extraction path is joined from the values of extract_rootdir, DZB1216, and 2001. This results in D:\SPRING2019\Test\DZB1216\2001 as the extraction path.
tarfile then extracts everything from the .tgz file.