Unzip (Zip, Tar, Tar.Gz) Files With Ruby

Unzip (zip, tar, tar.gz) files with Ruby

To extract files from a .tar.gz file you can use the following methods from packages distributed with Ruby:

require 'rubygems/package'
require 'zlib'

tar_extract = Gem::Package::TarReader.new(Zlib::GzipReader.open('Path/To/myfile.tar.gz'))
tar_extract.rewind # The reader has to be rewound before each new iteration
tar_extract.each do |entry|
  puts entry.full_name
  puts entry.directory?
  puts entry.file?
  # puts entry.read
end
tar_extract.close

Each entry of type Gem::Package::TarReader::Entry points to a file or directory within the .tar.gz file.

Similar code can be used (replace Reader with Writer) to write files to a .tar.gz file.
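As a rough sketch of the Writer direction, the block below packs a few in-memory strings into a .tar.gz and reads them back. The method names write_tar_gz and read_tar_gz and their arguments are made up for this illustration. It uses add_file_simple because that method only needs the size up front and does not require a seekable stream, which a GzipWriter is not.

```ruby
require 'rubygems/package'
require 'zlib'

# Hypothetical helper: pack { 'name/in/archive' => 'contents' } into a .tar.gz.
def write_tar_gz(archive_path, files)
  Zlib::GzipWriter.open(archive_path) do |gz|
    # The block form closes the tar stream (writes the end-of-archive blocks).
    Gem::Package::TarWriter.new(gz) do |tar|
      files.each do |name, contents|
        # Arguments: entry name, permission bits, size in bytes.
        tar.add_file_simple(name, 0o644, contents.bytesize) do |io|
          io.write(contents)
        end
      end
    end
  end
end

# Hypothetical helper: read every regular file in a .tar.gz into a hash.
def read_tar_gz(archive_path)
  contents = {}
  Zlib::GzipReader.open(archive_path) do |gz|
    Gem::Package::TarReader.new(gz).each do |entry|
      contents[entry.full_name] = entry.read if entry.file?
    end
  end
  contents
end
```

For example, write_tar_gz('example.tar.gz', 'a.txt' => "hello\n") followed by read_tar_gz('example.tar.gz') should give back the same name and contents. Both helpers hold whole files in memory, so they suit small archives only.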

Extract multiple files from gzip in ruby

I have multiple .txt files in a .gz file, and I would like to extract all of the .txt files from it.

A gzip file cannot hold multiple files. gzip only compresses a single file.

If you want to compress multiple files, you first need to tar them together and then gzip the resulting .tar file, which does not appear to be the case with the file you are using.

If you can read the contents of sample.gz with the code you provided, that is further evidence it holds only one file. You can also run gunzip sample.gz from the command line to confirm this.
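If you would rather check from Ruby, here is a minimal sketch that looks at the start of the decompressed data for a tar header. It relies on the "ustar" magic string that POSIX and GNU tar write at byte offset 257 of each header block. Very old tar formats omit it, so a false result is a strong hint, not proof. The method name tar_inside_gzip? is made up for this example.

```ruby
require 'zlib'

# Hypothetical helper: true when the gzip payload begins with a tar header.
def tar_inside_gzip?(gz_path)
  Zlib::GzipReader.open(gz_path) do |gz|
    header = gz.read(512) # a tar header block is 512 bytes
    # The ustar magic occupies bytes 257..261 of the header block.
    !header.nil? && header.bytesize >= 262 && header.byteslice(257, 5) == 'ustar'
  end
end
```

If this returns false for sample.gz, the file is most likely a single compressed file, as described above.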

EDIT:

If you want the code to output an uncompressed .txt file:

require 'zlib'

# The block forms close both files even if an error occurs
File.open('sample.txt', 'w') do |output_file|
  Zlib::GzipReader.open('sample.gz') do |gz_extract|
    gz_extract.each_line do |line|
      output_file.write(line)
    end
  end
end

Download and write .tar.gz files without corruption

I've successfully downloaded and extracted GZip files with this code:

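require 'open-uri'
require 'zlib'

# Binary mode keeps newline translation from corrupting the archive on Windows
File.open('tarball.tar', 'wb') do |local_file|
  # URI.open is required on Ruby 3.0+, where Kernel#open no longer handles URLs
  URI.open('http://github.com/jashkenas/coffee-script/tarball/master/tarball.tar.gz') do |remote_file|
    local_file.write(Zlib::GzipReader.new(remote_file).read)
  end
end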

what ruby gem should I use to handle tar archive manipulation?

I ended up giving up on using a gem to manipulate the tar archives, and just did it by shelling out to the command line.

`cd #{container} && tar xvfz sdk.tar.gz`    
`cd #{container} && tar xvfz Wizard.tar.gz`

# Update the framework packaged with the wizard
FileUtils.rm_rf(container + "/Wizard.app/Contents/Resources/SDK.bundle")
FileUtils.rm_rf(container + "/Wizard.app/Contents/Resources/SDK.framework")
FileUtils.mv(container + "/resources/SDK.bundle", container + "/Wizard.app/Contents/Resources/")
FileUtils.mv(container + "/resources/SDK.framework", container + "/Wizard.app/Contents/Resources/")

config_plist = render_to_string({
  file: 'site/_wizard_config',
  layout: false,
  locals: { app_id: @version.app.id },
  formats: 'xml'
})

File.open(container + "/Wizard.app/Contents/Resources/Configuration.plist", 'w') { |file| file.write( config_plist ) }

`cd #{container} && rm Wizard.tar.gz`
`cd #{container} && tar -cvf Wizard.tar 'Wizard.app'`
`cd #{container} && gzip Wizard.tar`

All these backticks make me feel like I'm writing Perl again.

Trying to unzip a 600mb tgz with ruby gives out of integer range error

You can likely fix this by changing this:

File.open(file_path, 'wb') do |f|
  f.write(entry.read)
end

to a loop in which you call entry.read with an argument giving the maximum number of bytes to process in that iteration. entry.read may return nil to indicate there is no more data, so you might have to split the read and the write into two separate calls and check the result in between.
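A minimal sketch of such a loop follows. The method name write_entry_in_chunks and the 1 MiB default chunk size are arbitrary choices for this example, not something from the original question. The entry argument is expected to be a Gem::Package::TarReader::Entry.

```ruby
require 'rubygems/package'
require 'zlib'

# Hypothetical helper: copy one tar entry to file_path a chunk at a time,
# so the whole entry is never held in memory at once.
def write_entry_in_chunks(entry, file_path, chunk_size = 1024 * 1024)
  File.open(file_path, 'wb') do |f|
    # entry.read(n) returns at most n bytes, or nil once the entry is exhausted.
    while (chunk = entry.read(chunk_size))
      break if chunk.empty? # guard against an endless loop on an empty read
      f.write(chunk)
    end
  end
end
```

Inside the existing tar_extract.each loop, this would be called as write_entry_in_chunks(entry, file_path) for each entry where entry.file? is true.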

[Fluentd] How to unzip files in Fluentd

Given your requirement, you can still achieve this with the in_exec plugin. Create a shell script that accepts a path in which to look for .gz files and a wildcard pattern to match file names. Inside the script, unzip the files in the given folder_path that match the wildcard pattern. The shell invocation should look like:

sh unzip.sh <folder_path_to_monitor> <wildcard_to_files>

Then use the above command in the in_exec source in your config, which will look like:

<source>
  @type exec
  format json
  tag unzip.sh
  command sh unzip.sh <folder_path_to_monitor> <wildcard_to_files>
  run_interval 10s
</source>

