Serving Dynamically Generated Zip Archives in Django

Serving dynamically generated ZIP archives in Django

The solution is as follows.

Use Python's zipfile module to create the archive, but pass a StringIO object as the file (the ZipFile constructor accepts any file-like object; on Python 3 use io.BytesIO, since the archive is binary data). Add the files you want to compress. Then, in your Django view, return the contents of that buffer in an HttpResponse with the MIME type set to application/x-zip-compressed (or at least application/octet-stream). You can also set a Content-Disposition header to suggest a download filename, though it is not strictly required.

But beware: creating a zip archive on every request is a bad idea and may kill your server (not counting timeouts if the archives are large). A better-performing approach is to cache the generated output somewhere on the filesystem and regenerate it only if the source files have changed. An even better idea is to prepare the archives in advance (e.g. with a cron job) and have your web server serve them as ordinary static files.
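
The caching idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name build_zip_if_stale and the mtime comparison are hypothetical choices, not something the answer prescribes, and source_paths is assumed to be non-empty.

```python
import os
import zipfile


def build_zip_if_stale(source_paths, cache_path):
    """Rebuild cache_path only if it is missing or older than a source file.

    Returns True if the archive was (re)built, False if the cached copy
    was reused. Assumes source_paths is a non-empty list of existing files.
    """
    newest_source = max(os.path.getmtime(p) for p in source_paths)
    if os.path.exists(cache_path) and os.path.getmtime(cache_path) >= newest_source:
        return False

    # Write under a temporary name first, then rename, so a concurrent
    # reader never sees a half-written archive.
    tmp_path = cache_path + ".tmp"
    with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in source_paths:
            zf.write(path, os.path.basename(path))
    os.replace(tmp_path, cache_path)
    return True
```

A view could call this before handing cache_path to the web server, so the expensive compression only runs when a source file is newer than the cached archive.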

Django - Create A Zip of Multiple Files and Make It Downloadable

I've posted this on the duplicate question which Willy linked to, but since questions with a bounty cannot be closed as a duplicate, might as well copy it here too:

import os
import zipfile
import StringIO  # Python 2 only; on Python 3 use io.BytesIO instead

from django.http import HttpResponse

def getfiles(request):
    # Files (local path) to put in the .zip
    # FIXME: Change this (get paths from DB etc)
    filenames = ["/tmp/file1.txt", "/tmp/file2.txt"]

    # Folder name in ZIP archive which contains the above files
    # E.g [thearchive.zip]/somefiles/file2.txt
    # FIXME: Set this to something better
    zip_subdir = "somefiles"
    zip_filename = "%s.zip" % zip_subdir

    # Open StringIO to grab in-memory ZIP contents
    s = StringIO.StringIO()

    # The zip compressor
    zf = zipfile.ZipFile(s, "w")

    for fpath in filenames:
        # Calculate path for file in zip
        fdir, fname = os.path.split(fpath)
        zip_path = os.path.join(zip_subdir, fname)

        # Add file, at correct path
        zf.write(fpath, zip_path)

    # Must close zip for all contents to be written
    zf.close()

    # Grab ZIP file from in-memory, make response with correct MIME-type
    # (Django 1.7 and later: pass content_type= instead of mimetype=)
    resp = HttpResponse(s.getvalue(), mimetype="application/x-zip-compressed")
    # ..and correct content-disposition
    resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename

    return resp

Can I set the permissions of the contents of a dynamically generated zip file in Google App Engine?

Yup, see the docs for the Python zipfile module. Specifically, the signature of the writestr method, which is:

ZipFile.writestr(zinfo_or_arcname, bytes[, compress_type])

The first argument can be the filename, or a ZipInfo object, which allows you to specify information about the file to be stored. I believe the relevant field to set to change the permissions of the file is the external_attr, but some experimentation reading existing zip files may be required to determine this.
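
As a rough sketch of that experiment: on Unix-style archives the permission bits are conventionally stored in the upper 16 bits of external_attr. The helper name add_with_permissions and the sample file are made up for illustration, and whether the mode is honoured depends on the tool that extracts the archive.

```python
import io
import stat
import zipfile


def add_with_permissions(zf, arcname, data, mode=0o644):
    """Write data into zf under arcname, recording Unix permission bits."""
    info = zipfile.ZipInfo(arcname)
    info.create_system = 3  # 3 = Unix, so extractors interpret the mode bits
    # Regular-file type flag plus permission bits, in the upper 16 bits
    info.external_attr = (stat.S_IFREG | (mode & 0o7777)) << 16
    info.compress_type = zipfile.ZIP_DEFLATED
    zf.writestr(info, data)


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    add_with_permissions(zf, "run.sh", b"#!/bin/sh\necho hello\n", mode=0o755)
```

Reading the archive back with ZipFile.getinfo("run.sh").external_attr >> 16 is one way to check what was actually stored.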

Create zip archive for instant download

See the question "Serving dynamically generated ZIP archives in Django".

Streaming zip in Django for large non-local files possible?

This sounds to me like a perfect use case for queueing jobs and processing them in the background.

Advantages:

  1. since retrieving and zipping the files requires a variable (and possibly significant) time, that should be decoupled from the HTTP request/response cycle;
  2. multiple jobs will be serialized for execution in the task queue.

The second advantage is particularly desirable since you’re prepared to receive multiple concurrent requests.

I would also consider using a “task” Django model with a FileField as a container for the resulting zip file, so it can be served statically and efficiently by Nginx from the media folder.
As an additional benefit, you can monitor what’s going on directly from the Django admin user interface.

I’ve used a similar approach in many Django projects, and it has proven to be quite robust and manageable; you might want to take a quick look at the following Django app I’m using for that: https://github.com/morlandi/django-task

To summarize:

  • write a “task” Model with a FileField to be used as a container for the zipped result
  • upon receiving a request, insert a new record in the “task” table, and a new job in the background queue
  • the background job is responsible for collecting resources and zipping them; this is common Python stuff
  • on completion, save the result in the FileField and send a notification to the user
  • the user will follow the received url to download the zip file as a static file
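
The background part of these steps can be sketched with the standard library alone. This is a stand-in for a real task queue, not the django-task API: the names jobs, results, submit and start_worker are hypothetical, and the results dict merely plays the role of the “task” record with its FileField.

```python
import os
import queue
import threading
import zipfile

jobs = queue.Queue()
results = {}  # job_id -> path of the finished archive (stand-in for the FileField)


def worker():
    """Process queued jobs one at a time, so concurrent requests are serialized."""
    while True:
        job_id, source_paths, out_path = jobs.get()
        try:
            with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
                for path in source_paths:
                    zf.write(path, os.path.basename(path))
            results[job_id] = out_path
        finally:
            # Always mark the job done so jobs.join() cannot hang
            jobs.task_done()


def start_worker():
    """Start the single background worker thread."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def submit(job_id, source_paths, out_path):
    """Called from the request handler: enqueue the job and return immediately."""
    jobs.put((job_id, source_paths, out_path))
```

In a real deployment the worker would run in a separate process managed by the task queue, and the notification step would follow once the result has been saved.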

Downloading Zip file through Django, file decompresses as cpzg

Take a closer look at the example provided in this answer.

Notice that a StringIO is opened, ZipFile is called with the StringIO as a "file-like object", and then, crucially, after the ZipFile is closed, the contents of the StringIO are returned in the HttpResponse.

# Open StringIO to grab in-memory ZIP contents
s = StringIO.StringIO()

# The zip compressor
zf = zipfile.ZipFile(s, "w")

# ... add the files, then close the archive with zf.close() ...

# Grab ZIP file from in-memory, make response with correct MIME-type
resp = HttpResponse(s.getvalue(), mimetype="application/x-zip-compressed")

I would recommend a few things in your case.

  1. Use BytesIO for forward compatibility
  2. Take advantage of ZipFile's built in context manager
  3. In your Content-Disposition, be careful of "jobNumber" vs "JobNumber"

Try something like this:

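import io
import zipfile

from django.http import HttpResponse

def print_nozzle_txt(request):
    JobNumber = "123"
    phrase = "A, B, C, D, EF, G"
    words = phrase.split(",")

    byteStream = io.BytesIO()

    # Leaving the with block closes the archive, which writes its central directory
    with zipfile.ZipFile(byteStream, mode='w', compression=zipfile.ZIP_DEFLATED) as zf:
        for x, word in enumerate(words, start=1):
            # Encode explicitly; str.encode() returns a new object, so use its result
            zf.writestr(JobNumber + "_" + str(x) + ".txt", word.encode("UTF-8"))

    # Read the buffer only after the archive has been closed
    response = HttpResponse(byteStream.getvalue(), content_type='application/x-zip-compressed')
    # Quote the filename with double quotes; single quotes become part of the name
    response['Content-Disposition'] = 'attachment; filename="' + JobNumber + '_AHTextFiles.zip"'
    return response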

Download some files as one zipped archive

The call to tempfile.TemporaryFile() returns a file handle, not a file name.

This will close the file handle:

 archive.close()

Afterwards, the handle can't be used anymore. In fact, the file will be deleted from disk by closing it: https://docs.python.org/3/library/tempfile.html#tempfile.TemporaryFile

So even if you could ask the result of tempfile.TemporaryFile() for its name, that wouldn't help.

What you need to do is ask for a temporary file name (instead of just the file). Then create a handle for this file name, write the data, close the handle. For the request, create a new file handle using the name.

The method tempfile.NamedTemporaryFile() should work for you. Make sure you pass the option delete=False. You can get the path to the file from temp.name. See https://docs.python.org/3/library/tempfile.html#tempfile.NamedTemporaryFile
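
A minimal sketch of that sequence, assuming the archive members are already available as bytes. The helper name write_zip_to_named_tempfile and the members dict are illustrative only.

```python
import os
import tempfile
import zipfile


def write_zip_to_named_tempfile(members):
    """Write members (arcname -> bytes) to a temp zip and return its path."""
    # delete=False keeps the file on disk after the handle is closed
    temp = tempfile.NamedTemporaryFile(suffix=".zip", delete=False)
    temp.close()

    # Create a new handle from the name, write the data, close the handle
    with zipfile.ZipFile(temp.name, "w", zipfile.ZIP_DEFLATED) as archive:
        for arcname, data in members.items():
            archive.writestr(arcname, data)
    return temp.name


path = write_zip_to_named_tempfile({"hello.txt": b"hello"})
size = os.path.getsize(path)  # length for the response, taken from the name
```

For the response, a fresh handle would then be opened with open(path, 'rb'). The caller remains responsible for deleting the file afterwards.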

This will leave the file on disk after the response has been sent. To fix this, extend the FileWrapper and override the close() method:

class DeletingFileWrapper(FileWrapper):
    def close(self):
        # First close the file handle to avoid errors when deleting the file
        super(DeletingFileWrapper, self).close()

        os.remove(self.filelike.name)
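
The answer does not say which FileWrapper is meant. Assuming it is wsgiref.util.FileWrapper from the standard library, a self-contained variant might look like the sketch below. One detail worth checking in your own environment: wsgiref's FileWrapper binds filelike.close to the instance in its constructor, which would shadow a close() method defined on a subclass, so this sketch removes that attribute and closes the file directly.

```python
import os
import tempfile
from wsgiref.util import FileWrapper


class DeletingFileWrapper(FileWrapper):
    """Iterate over a file in blocks and delete it from disk on close()."""

    def __init__(self, filelike, blksize=8192):
        super().__init__(filelike, blksize)
        # Drop the instance-level close (if present) so that the
        # close() method below is the one that gets called.
        self.__dict__.pop("close", None)

    def close(self):
        # First close the file handle to avoid errors when deleting the file
        self.filelike.close()
        os.remove(self.filelike.name)
```

The server or response object is expected to call close() once iteration is finished; if it does not, the file stays on disk.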

If the ZIP file is big, you also want to use StreamingHttpResponse instead of HttpResponse since the latter will read the whole file into memory at once.

Update

You're still using an illegal (closed) file handle here: FileWrapper(temp)

Correct code would be:

wrapper = DeletingFileWrapper(open(temp.name, 'rb'))

You also need a method that takes a file name to determine the length, because open(temp.name).tell() always returns 0 on a freshly opened file. Check the os module, for example os.path.getsize(temp.name).

See also:

  • Serving dynamically generated ZIP archives in Django
  • Django Filewrapper memory error serving big files, how to stream

