Zlib in Ruby to Uncompress .gz

Zlib in Ruby to uncompress .gz

Zlib::GzipReader works like most IO-like classes in Ruby. It has an open method, and when you pass a block to it, the block receives the IO-like object. Think of it as a convenient way of doing something with a file or resource for the duration of the block.

But that means that in your example gz is an IO-like object, and not actually the contents of the gzip file, as you expect. You still need to read from it to get to that. The simplest fix would then be:

g.write(gz.read)

Note that this will read the entire contents of the uncompressed gzip into memory.

If all you're really doing is copying from one file to another, you can use the more efficient IO.copy_stream method. Your example might then look like:

Zlib::GzipReader.open('PRIDE_Exp_Complete_Ac_1015.xml.gz') do |input_stream|
  File.open("PRIDE_Exp_Complete_Ac_1015.xml", "w") do |output_stream|
    IO.copy_stream(input_stream, output_stream)
  end
end

Behind the scenes, this will try to use the sendfile syscall, which is available in some specific situations on Linux. Otherwise, it does the copying in fast C code, 16 KB at a time. I learned this from the Ruby 1.9.1 source code.
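The copy_stream approach can be wrapped in a small helper. This is a minimal sketch, not the original poster's code: the method name gunzip_file and the paths in the usage note are hypothetical, and the output file is opened in binary mode on the assumption that the payload should be copied byte for byte.

```ruby
require 'zlib'

# Hypothetical helper: stream-decompress src_path (a .gz file) into dest_path
# without holding the whole uncompressed payload in memory.
def gunzip_file(src_path, dest_path)
  Zlib::GzipReader.open(src_path) do |input_stream|
    # 'wb' avoids newline or encoding translation on the way out.
    File.open(dest_path, 'wb') do |output_stream|
      IO.copy_stream(input_stream, output_stream)
    end
  end
end
```

It would be called as gunzip_file('archive.xml.gz', 'archive.xml'); both block forms close their handles even if the copy raises.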

How to decompress Gzip string in ruby?

The above method didn't work for me.

I kept getting an incorrect header check (Zlib::DataError) error. Apparently it assumes the data has a header by default, which may not always be the case.

The workaround that I implemented was:

require 'zlib'
require 'stringio'
gz = Zlib::GzipReader.new(StringIO.new(resp.body.to_s))
uncompressed_string = gz.read
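One common cause of incorrect header check is handing gzip-format data to Zlib::Inflate, which expects a zlib header unless told otherwise. The sketch below shows two ways to decompress an in-memory string. The method names gunzip_string and inflate_any are hypothetical, and it assumes the input is a complete gzip or zlib stream.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: decompress a gzip-format string through an IO wrapper.
def gunzip_string(data)
  Zlib::GzipReader.wrap(StringIO.new(data)) { |gz| gz.read }
end

# Hypothetical helper: decompress a string that may carry either a gzip
# or a zlib header.
def inflate_any(data)
  # Adding 32 to the window bits asks zlib to auto-detect the header type.
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 32)
  inflater.inflate(data)
ensure
  inflater&.close
end
```

gunzip_string mirrors the workaround above but closes the reader when done; inflate_any may be handy when it is unclear which of the two formats a server returns.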

unzip (zip, tar, tar.gz) files with ruby

To extract files from a .tar.gz file you can use the following methods from packages distributed with Ruby:

require 'rubygems/package'
require 'zlib'
tar_extract = Gem::Package::TarReader.new(Zlib::GzipReader.open('Path/To/myfile.tar.gz'))
tar_extract.rewind # The reader has to be rewound before each iteration
tar_extract.each do |entry|
  puts entry.full_name
  puts entry.directory?
  puts entry.file?
  # puts entry.read
end
tar_extract.close

Each entry of type Gem::Package::TarReader::Entry points to a file or directory within the .tar.gz file.

Similar code can be used (replace Reader with Writer) to write files to a .tar.gz file.
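As a rough illustration of the Writer direction, the sketch below builds a .tar.gz in memory and reads it back. The method names write_tar_gz and read_tar_gz and the 0o644 file mode are assumptions for the example. It uses add_file_simple, which needs the size up front, because a gzip stream cannot be seeked back to patch the header.

```ruby
require 'rubygems/package'
require 'zlib'
require 'stringio'

# Hypothetical helper: pack a { name => content } hash into .tar.gz bytes.
def write_tar_gz(files)
  buffer = StringIO.new(String.new) # binary string as the backing store
  Zlib::GzipWriter.wrap(buffer) do |gz|
    Gem::Package::TarWriter.new(gz) do |tar|
      files.each do |name, content|
        tar.add_file_simple(name, 0o644, content.bytesize) { |io| io.write(content) }
      end
    end
  end
  buffer.string
end

# Hypothetical helper: return { name => content } for the regular files
# found in .tar.gz bytes.
def read_tar_gz(data)
  entries = {}
  Zlib::GzipReader.wrap(StringIO.new(data)) do |gz|
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each { |entry| entries[entry.full_name] = entry.read if entry.file? }
    end
  end
  entries
end
```

To write to disk instead, Zlib::GzipWriter.open with a path could replace the StringIO wrapper.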

Compress large file in ruby with Zlib for gzip

You can use IO#read to read a chunk of arbitrary length from the file.

require 'zlib'

Zlib::GzipWriter.open('compressed_file.gz') do |gz|
  File.open(large_data_file, 'rb') do |fp|
    while (chunk = fp.read(16 * 1024))
      gz.write(chunk)
    end
  end
  # No explicit gz.close is needed: the block form closes the writer on exit.
end

This reads the source file in 16 KB chunks and writes each chunk to the gzip stream, which compresses it as it goes. Adjust the chunk size to suit your environment.

Extract multiple files from gzip in ruby

Actually, I have multiple .txt files in a .gz file. I would like to extract all the .txt files from the .gz file.

A gzip file cannot contain multiple files. It only works with one file.

If you want to compress multiple files, you first need to tar them together and then gzip the resulting .tar file, which does not appear to be the case with the file you are using.

If you can read the contents of sample.gz with the code you provided, this is further proof that you have only one file inside. You can also try gunzip sample.gz from the command line to confirm that it contains only one file.
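Another way to check is to look at the start of the uncompressed data: archives in the POSIX ustar format carry the string ustar at byte offset 257 of the first header block. The sketch below tests for that. The method name tar_inside_gzip? is hypothetical, and very old tar variants without this magic string would not be detected.

```ruby
require 'zlib'

USTAR_MAGIC_OFFSET = 257 # position of the "ustar" magic in a tar header block

# Hypothetical helper: true if the gzip stream read from io appears to wrap
# a ustar-format tar archive. Note that it closes io when done.
def tar_inside_gzip?(io)
  Zlib::GzipReader.wrap(io) do |gz|
    header = gz.read(512).to_s # a tar header block is 512 bytes
    header[USTAR_MAGIC_OFFSET, 5] == 'ustar'
  end
end
```

Passing it a file opened with File.open('sample.gz', 'rb') should return false for a plain gzipped text file and true for a gzipped tarball.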

EDIT:

If you want the code to output an uncompressed .txt file:

require 'zlib'

# The block forms close both files automatically, even if an error occurs.
File.open('sample.txt', 'w') do |output_file|
  Zlib::GzipReader.open("sample.gz") do |gz_extract|
    gz_extract.each_line do |extract|
      output_file.write(extract)
    end
  end
end

