Zip file and print to stdout
The zip program from Info-ZIP (the one usually found on Linux systems) can write a ZIP file to standard output when you use - as the name of the file. For example, you could send a zip file to port 8787 on a remote host with this command:
zip -r - files_to_be_archived | nc remotehost 8787
All of this is documented on the zip command manual page.
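The same idea can be sketched in Python with the standard zipfile module: build the archive in a seekable in-memory buffer, then write the bytes to standard output. The function names build_zip_bytes and write_zip_to_stdout are hypothetical, and the sketch assumes the member contents are small enough to hold in memory.

```python
import io
import sys
import zipfile


def build_zip_bytes(members):
    """Return a ZIP archive as bytes; members maps archive names to bytes."""
    buffer = io.BytesIO()  # seekable in-memory file, which zipfile can write to
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def write_zip_to_stdout(members):
    """Write the archive to stdout as binary data (hypothetical helper)."""
    sys.stdout.buffer.write(build_zip_bytes(members))
    sys.stdout.buffer.flush()
```

Calling write_zip_to_stdout({"hello.txt": b"hello\n"}) from a script lets you pipe the result onward, much like zip -r - does.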
Print Archive::Zip zip file to Apache2::RequestIO object
Something like this should help...
use Archive::Zip;
my $zip = Archive::Zip->new();
#create your zip here
use IO::Scalar;
my $memory_file = ''; #scalar as a file
my $memfile_fh = IO::Scalar->new(\$memory_file); #filehandle to the scalar
# write to the scalar $memory_file
my $status = $zip->writeToFileHandle($memfile_fh);
$memfile_fh->close;
#print with apache
#$r->content_type(".......");
$r->print($memory_file); #the content of a file-in-a-scalar
EDIT:
The above is obsolete.
From the Archive::Zip docs:
Try to avoid IO::Scalar
One of the most common ways to use Archive::Zip is to generate Zip files in-memory. Most people use IO::Scalar for this purpose.
Unfortunately, as of 1.11 this module no longer works with IO::Scalar as it incorrectly implements seeking.
Anybody using IO::Scalar should consider porting to IO::String, which is smaller, lighter, and is implemented to be perfectly compatible with regular seekable filehandles.
Support for IO::Scalar most likely will not be restored in the future, as IO::Scalar itself cannot change the way it is implemented due to back-compatibility issues.
Capturing stdout to zip and interrupting using CTRL-C gives a corrupted zip file
When you press Ctrl+C, the terminal sends SIGINT to every process in the foreground process group, and that includes gzip. gzip terminates, and the next time prog writes to stdout it receives SIGPIPE.
SIGINT should reach prog but not gzip, so that prog flushes its stdout and exits (provided you installed the signal handler as you did) and gzip receives all of its output and then terminates.
You can run your pipeline as follows:
prog | setsid gzip > file.gz & wait
This uses the shell's job control feature to start the pipeline in the background (the & symbol) and then waits for the job to terminate. On Ctrl+C, SIGINT is sent to the foreground process, which is the shell in wait, and to all processes in the same terminal process group. prog is in that group, but gzip is started with setsid to place it in another session and process group, so it does not receive SIGINT and instead terminates when its stdin is closed after prog has terminated.
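If the pipeline is launched from Python rather than from a shell, a similar effect can be sketched with subprocess: passing start_new_session=True makes the child call setsid(), so a terminal Ctrl+C does not reach it. The function name run_shielded_pipeline and its parameters are hypothetical, and the sketch assumes a POSIX system and a producer that handles SIGINT by flushing and exiting.

```python
import subprocess


def run_shielded_pipeline(producer_cmd, consumer_cmd, out_path):
    """Run `producer | consumer > out_path` with the consumer in its own session."""
    with open(out_path, "wb") as out:
        producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
        consumer = subprocess.Popen(
            consumer_cmd,
            stdin=producer.stdout,
            stdout=out,
            start_new_session=True,  # child calls setsid(), like `setsid gzip`
        )
        # Close the parent's copy of the pipe so the consumer sees EOF
        # as soon as the producer exits.
        producer.stdout.close()
        try:
            producer.wait()
        except KeyboardInterrupt:
            # Ctrl+C also reached the producer; give it time to flush and exit.
            producer.wait()
        consumer.wait()
    return producer.returncode, consumer.returncode
```

For the case above, the call would look like run_shielded_pipeline(["./prog"], ["gzip"], "file.gz"), where ./prog stands in for your own program.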
How to gzip a file while also printing compressed contents to stdout
This does what you are asking for in one line:
ls file*.txt | xargs -n1 -I'{}' bash -c 'cat {} | gzip - | tee {}.gz >> compressed.gz; touch {}.gz -r {}'
Each input file is read from disk only once, and the compressed version is saved twice: once as its own file*.txt.gz file and once appended to the catch-all compressed.gz file. Finally, touch -r sets the timestamp of each gzipped file to that of its original.
Note that this does not delete the original txt file. To remove each file after compressing:
ls file*.txt | xargs -n1 -I'{}' bash -c 'cat {} | gzip - | tee {}.gz >> compressed.gz; touch {}.gz -r {}; rm {}'
Tested on Linux with the GNU versions of ls, xargs, bash, cat, gzip, tee, touch and rm.
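For comparison, here is a rough Python sketch of the same approach: compress each file once, write the compressed bytes to its own .gz file, append them to a combined file, and copy the timestamps as touch -r does. The function name gzip_and_append is hypothetical, and the sketch assumes each file fits in memory.

```python
import gzip
import os


def gzip_and_append(src_path, combined_path):
    """Compress src_path once; save it as src_path + '.gz' and append to combined_path."""
    with open(src_path, "rb") as src:
        compressed = gzip.compress(src.read())

    gz_path = src_path + ".gz"
    with open(gz_path, "wb") as gz_file:
        gz_file.write(compressed)

    # Appending whole gzip members yields a valid multi-member gzip file.
    with open(combined_path, "ab") as combined:
        combined.write(compressed)

    # Copy access and modification times from the original (like `touch -r`).
    st = os.stat(src_path)
    os.utime(gz_path, ns=(st.st_atime_ns, st.st_mtime_ns))
    return gz_path
```

As in the one-liner, the original file is left in place; call os.remove(src_path) afterwards if you want it deleted.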
Python ZipFile is zipping part of the text files
For each Popen object you create, you must call the wait method on it so that the process completes before you start zipping up the file.
process = subprocess.Popen(cmd, stdout = f)
process.wait()
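A fuller sketch of that ordering might look like the following, where the output file is closed and the process has finished before the archive is created. The function name run_and_zip and its parameters are hypothetical.

```python
import os
import subprocess
import zipfile


def run_and_zip(cmd, out_path, zip_path):
    """Run cmd with stdout redirected to out_path, then zip the finished file."""
    with open(out_path, "wb") as f:
        process = subprocess.Popen(cmd, stdout=f)
        process.wait()  # block until the command has written all of its output

    # Only now is the output complete and the file handle closed.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write(out_path, arcname=os.path.basename(out_path))
    return process.returncode
```

Without the wait call, the ZipFile step can run while the command is still writing, which is how a partially filled text file ends up in the archive.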
How to print the content of zipped gzip'd files
I created a zip file containing a gzip'ed PDF file I grabbed from the web.
I ran this code (with two small changes):
1) Fixed indenting of everything under the def statement (which I also corrected in your Question because I'm sure that it's right on your end or it wouldn't get to the problem you have).
2) I changed:
zfiledata = zfile.open(name)
print("start for file ", name)
with gzip.open(zfiledata,'r') as gzfile:
    print("done opening")
    filecontent = gzfile.read()
    print("done reading")
    print(filecontent)
to:
print("start for file ", name)
with gzip.open(name,'rb') as gzfile:
    print("done opening")
    filecontent = gzfile.read()
    print("done reading")
    print(filecontent)
I made this change because you were passing a file object to gzip.open instead of a filename string. I have no idea how your code is executing without that change, but it was crashing for me until I fixed it.
EDIT: Adding link to GZIP docs from James R's answer --
Also, see here for further documentation:
http://docs.python.org/2/library/gzip.html#examples-of-usage
END EDIT
Now, since my gzip'ed file is small, the behavior I observe is that it pauses for about 3 seconds after printing done reading, then outputs what is in filecontent.
I would suggest adding the following debugging line after your print "done reading": print len(filecontent). If this number is very, very large, consider not printing the entire file contents in one shot.
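One way to avoid handling the whole content in one shot is to decompress it in fixed-size chunks. The following is a minimal sketch, assuming Python 3 and gzip data already held in memory as bytes; the generator name iter_gzip_chunks and the default chunk size are hypothetical choices.

```python
import gzip
import io


def iter_gzip_chunks(raw_bytes, chunk_size=64 * 1024):
    """Yield the decompressed content of gzip data in pieces of at most chunk_size bytes."""
    with gzip.GzipFile(fileobj=io.BytesIO(raw_bytes), mode="rb") as gz:
        while True:
            chunk = gz.read(chunk_size)
            if not chunk:  # an empty read means end of data
                break
            yield chunk
```

Each chunk can then be written with sys.stdout.buffer.write(chunk), or you can total the lengths first to see how much data you are dealing with.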
I would also suggest reading this for more insight into what I expect is your problem: Why is printing to stdout so slow? Can it be sped up?
EDIT 2 - an alternative if your system does not handle file io on zip files, causing no such file errors in the above:
def parseSTS(afile):
    import zipfile
    import zlib
    import gzip
    import io
    with zipfile.ZipFile(afile, 'r') as archive:
        for name in archive.namelist():
            if name.endswith('.gz'):
                bfn = archive.read(name)  # raw bytes of the gzip'ed zip member
                bfi = io.BytesIO(bfn)  # wrap the bytes in an in-memory file object
                g = gzip.GzipFile(fileobj=bfi,mode='rb')
                qqq = g.read()
                print(qqq)
parseSTS('t.zip')
Print a file content from a tar.gz archive
You can use vim instead, and then browse using your cursor:
vim openjdk-17.0.2_linux-aarch64_bin.tar.gz
Alternatively, have a look at this thread.
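If you would rather do this from a script, Python's standard tarfile module can list the members of a tar.gz archive and read one without unpacking the rest. This is only a sketch; the function names list_members and read_member are hypothetical.

```python
import tarfile


def list_members(archive_path):
    """Return the names of all entries in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()


def read_member(archive_path, member_name):
    """Return the content of one regular file in the archive as bytes."""
    with tarfile.open(archive_path, "r:gz") as archive:
        member = archive.extractfile(member_name)  # KeyError if the name is absent
        if member is None:
            # extractfile returns None for directories and other non-file entries
            raise ValueError(member_name + " is not a regular file")
        with member:
            return member.read()
```

Call list_members first to find the exact path of the entry inside the archive, then pass that path to read_member and print the result.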