Zip File and Print to Stdout

Zip file and print to stdout

The zip program from Info-ZIP (the one usually found on Linux systems) can write the archive to standard output when you use - as the archive name. For example, you could send a ZIP file to port 8787 on a remote host with this command:

zip -r - files_to_be_archived | nc remotehost 8787

All of this is documented in the zip manual page.
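
If you want the same thing from a script rather than the shell, here is a minimal Python sketch of the idea (the file list and script name are placeholders, and it assumes a reasonably recent Python 3, since zipfile only learned to write to unseekable streams such as a pipe in later 3.x releases):

import sys
import zipfile

# Placeholder input list; substitute your own paths.
files_to_be_archived = ["file1.txt", "file2.txt"]

# Write the archive straight to the raw stdout buffer so the binary
# data is not mangled by text-mode encoding.
with zipfile.ZipFile(sys.stdout.buffer, mode="w",
                     compression=zipfile.ZIP_DEFLATED) as archive:
    for path in files_to_be_archived:
        archive.write(path)

You could then pipe it the same way, e.g. python make_zip.py | nc remotehost 8787 (make_zip.py being whatever you call the script).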

Print Archive::Zip zip file to Apache2::RequestIO object

Something like this should help...

use Archive::Zip;
my $zip = Archive::Zip->new();
#create your zip here

use IO::Scalar;
my $memory_file = ''; #scalar as a file
my $memfile_fh = IO::Scalar->new(\$memory_file); #filehandle to the scalar

# write to the scalar $memory_file
my $status = $zip->writeToFileHandle($memfile_fh);
$memfile_fh->close;

#print with apache
#$r->content_type(".......");
$r->print($memory_file); #the content of a file-in-a-scalar

EDIT:
The above is now obsolete. From the Archive::Zip docs:

Try to avoid IO::Scalar

One of the most common ways to use Archive::Zip is to generate Zip
files in-memory. Most people have used IO::Scalar for this purpose.

Unfortunately, as of 1.11 this module no longer works with IO::Scalar
as it incorrectly implements seeking.

Anybody using IO::Scalar should consider porting to IO::String, which
is smaller, lighter, and is implemented to be perfectly compatible
with regular seekable filehandles.

Support for IO::Scalar most likely will not be restored in the future,
as IO::Scalar itself cannot change the way it is implemented due to
back-compatibility issues.

Capturing stdout to zip and interrupting using CTRL-C gives a corrupted zip file

When you press Ctrl+C the shell sends SIGINT to the last process in the pipeline, which is gzip here. gzip terminates, and the next time prog writes to its stdout it receives SIGPIPE.

You need to send SIGINT to prog so that it flushes its stdout and exits (provided you installed the signal handler as you did); that way gzip receives all of prog's output and then terminates.


You can run your pipeline as follows:

prog | setsid gzip > file.gz & wait

It uses the shell's job-control feature to start the pipeline in the background (that is what the & does), then waits for the job to terminate. On Ctrl+C, SIGINT is sent to the foreground process, which is the shell sitting in wait, and to all processes in the same terminal process group (unlike when the pipeline runs in the foreground, where SIGINT is sent only to the last process in the pipeline). prog is in that group, but gzip is started with setsid to place it in a different group, so it does not receive SIGINT and instead terminates when its stdin is closed after prog exits.
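
The answer above assumes prog already installs a SIGINT handler that flushes before exiting. As a rough illustration only (not the asker's actual program), a Python stand-in for prog might look like this:

import signal
import sys

def handle_sigint(signum, frame):
    # Flush whatever is still buffered so gzip downstream sees the
    # complete output, then exit cleanly.
    sys.stdout.flush()
    sys.exit(0)

signal.signal(signal.SIGINT, handle_sigint)

# Produce some output on stdout.
for i in range(10_000_000):
    print(i)

Run it as python prog.py | setsid gzip > file.gz & wait and Ctrl+C should leave file.gz intact, because only the Python process receives SIGINT.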

How to gzip a file while also printing compressed contents to stdout

This does what you are asking for in one line:

ls file*.txt | xargs -n1 -I'{}' bash -c 'cat {} | gzip - | tee {}.gz >> compressed.gz; touch {}.gz -r {}'

Each input file is read from disk only once, and the compressed version is saved twice: once as its own file*.txt.gz, and once appended to the catch-all compressed.gz file. Finally, touch adjusts the timestamp of each gzipped file to match the original after compression.

Note that this does not delete the original txt file. To remove each file after compressing:

ls file*.txt | xargs -n1 -I'{}' bash -c 'cat {} | gzip - | tee {}.gz >> compressed.gz; touch {}.gz -r {}; rm {}'

Tested on Linux with the GNU versions of ls, xargs, bash, cat, gzip, tee, touch and rm.
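
If you would rather do the same thing from a script, here is a rough Python equivalent of the idea (the file pattern and output names mirror the one-liner and are only illustrative): each file is read once, the compressed bytes go both to a per-file .gz and to the catch-all compressed.gz, and the .gz timestamp is set to match the original, like touch -r:

import glob
import gzip
import os

# Append mode so compressed.gz accumulates every member, like tee >> above.
with open("compressed.gz", "ab") as combined:
    for path in glob.glob("file*.txt"):
        with open(path, "rb") as src:
            data = gzip.compress(src.read())   # read and compress once
        with open(path + ".gz", "wb") as per_file:
            per_file.write(data)               # per-file archive
        combined.write(data)                   # catch-all archive
        stat = os.stat(path)
        os.utime(path + ".gz", (stat.st_atime, stat.st_mtime))  # touch -r behaviour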

Python ZipFile is zipping part of the text files

For each Popen object you create, you must call the wait method on it so that the process completes before you start zipping up the file.

process = subprocess.Popen(cmd, stdout=f)
process.wait()  # block until the command has finished writing to f
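
Putting it together, a minimal sketch of the pattern (command and file names are placeholders, not the asker's code) is: start the subprocess, wait for it to finish, and only then add its output file to the archive:

import subprocess
import zipfile

cmd = ["some_command", "--some-flag"]   # placeholder command

with open("output.txt", "wb") as f:
    process = subprocess.Popen(cmd, stdout=f)
    process.wait()                      # the file is complete only after this

# Zip the file only once the subprocess has exited.
with zipfile.ZipFile("output.zip", "w", zipfile.ZIP_DEFLATED) as archive:
    archive.write("output.txt")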

How to print the content of zipped gzip'd files

I created a zip file containing a gzip'ed PDF file I grabbed from the web.

I ran this code (with two small changes):

1) Fixed indenting of everything under the def statement (which I also corrected in your Question because I'm sure that it's right on your end or it wouldn't get to the problem you have).

2) I changed:

zfiledata = zfile.open(name)
print("start for file ", name)
with gzip.open(zfiledata, 'r') as gzfile:
    print("done opening")
    filecontent = gzfile.read()
    print("done reading")
    print(filecontent)

to:

            print("start for file ", name)
with gzip.open(name,'rb') as gzfile:
print("done opening")
filecontent = gzfile.read()
print("done reading")
print(filecontent)

Because you were passing a file object to gzip.open instead of a string. I have no idea how your code is executing without that change, but it was crashing for me until I fixed it.

EDIT: Adding link to GZIP docs from James R's answer --

Also, see here for further documentation:

http://docs.python.org/2/library/gzip.html#examples-of-usage

END EDIT

Now, since my gzip'ed file is small, the behavior I observe is that it pauses for about 3 seconds after printing done reading, then outputs what is in filecontent.

I would suggest adding the following debugging line after your print "done reading" -- print len(filecontent). If this number is very, very large, consider not printing the entire file contents in one shot.

I would also suggest reading this for more insight into what I expect is your problem: Why is printing to stdout so slow? Can it be sped up?

EDIT 2 - an alternative if your system does not handle file I/O on zip files, causing "no such file" errors in the above:

def parseSTS(afile):
    import zipfile
    import zlib
    import gzip
    import io
    with zipfile.ZipFile(afile, 'r') as archive:
        for name in archive.namelist():
            if name.endswith('.gz'):
                bfn = archive.read(name)            # compressed member, as bytes
                bfi = io.BytesIO(bfn)               # wrap in a file-like object
                g = gzip.GzipFile(fileobj=bfi, mode='rb')
                qqq = g.read()                      # decompressed content
                print qqq

parseSTS('t.zip')

Print a file content from a tar.gz archive

You can use vim instead, and then browse using your cursor:

vim openjdk-17.0.2_linux-aarch64_bin.tar.gz

Alternatively, have a look at this thread.
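
If you want to do it programmatically rather than in an editor, a small Python sketch using the standard tarfile module could look like this (the archive name comes from the command above; the member path is just a placeholder you would replace after listing the contents):

import tarfile

archive_path = "openjdk-17.0.2_linux-aarch64_bin.tar.gz"
member_name = "jdk-17.0.2/release"   # placeholder member path

with tarfile.open(archive_path, "r:gz") as tar:
    # List what is in the archive, much like browsing it in vim.
    for member in tar.getmembers():
        print(member.name)
    # Print the content of one specific member.
    extracted = tar.extractfile(member_name)
    if extracted is not None:
        print(extracted.read().decode("utf-8", errors="replace"))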


