Best Practices to Create and Download a huge ZIP (from several BLOBs) in a WebApp
For large content that won't fit in memory at once, stream the content from the database to the response.
This kind of thing is actually pretty simple. You don't need AJAX or websockets; it's possible to stream large file downloads through a simple link that the user clicks on. And modern browsers have decent download managers with their own progress bars - why reinvent the wheel?
If you are writing a servlet from scratch for this, get access to the database BLOB, get its input stream, and copy the content through to the HTTP response output stream. If you have the Apache Commons IO library, you can use IOUtils.copy(); otherwise you can do this yourself.
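If you go the do-it-yourself route, the copy is just a buffered read/write loop. Here is a minimal sketch; the class and method names are made up for illustration. In a servlet, `in` would typically come from `Blob.getBinaryStream()` and `out` from `HttpServletResponse.getOutputStream()`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: names are illustrative, not from any library.
class BlobStreamer {
    /**
     * Copies everything from in to out through a fixed-size buffer,
     * so memory use stays constant regardless of the BLOB size.
     * Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Only one buffer's worth of data is in memory at any time, which is the whole point for content that won't fit in memory at once.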
Creating a ZIP file on the fly can be done with a ZipOutputStream. Create one of these over the response output stream (from the servlet or whatever your framework gives you), then, for each BLOB from the database, call putNextEntry() first and then stream the BLOB as described before.
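A rough sketch of that loop, using only `java.util.zip`. The `ZipStreamer` class and the map of entry names to streams are assumptions standing in for your database access; in a real application you would probably open each BLOB stream lazily inside the loop rather than up front.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: the map stands in for "each BLOB from the database".
class ZipStreamer {
    /** Writes one ZIP entry per source stream directly onto target. */
    static void writeZip(Map<String, InputStream> sources, OutputStream target)
            throws IOException {
        ZipOutputStream zip = new ZipOutputStream(target);
        byte[] buffer = new byte[8192];
        for (Map.Entry<String, InputStream> source : sources.entrySet()) {
            // Start a new entry before writing any of its bytes.
            zip.putNextEntry(new ZipEntry(source.getKey()));
            try (InputStream in = source.getValue()) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    zip.write(buffer, 0, n);
                }
            }
            zip.closeEntry();
        }
        // finish() writes the central directory but leaves target open,
        // which is usually what you want for a container-managed response stream.
        zip.finish();
    }
}
```

In a servlet you would pass `response.getOutputStream()` as `target` after setting the `Content-Type` and `Content-Disposition` headers.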
Potential Pitfalls/Issues:
- Depending on the download size and network speed, the request might take a lot of time to complete. Firewalls, etc. can get in the way of this and terminate the request early.
- Hopefully your users are on a decent corporate network when requesting these files. It would be far worse over remote/dodgy/mobile connections (if it drops out after downloading 1.9 GB of 2.0 GB, users have to start again).
- It can put a bit of load on your server, especially compressing huge ZIP files. It might be worth turning compression down/off when creating the ZipOutputStream if this is a problem.
- ZIP files over 4 GB (2 GB for some older tools) might have issues with some ZIP programs. Java 7 added support for the ZIP64 extensions, so this version of Java will write the huge ZIP correctly, but will the clients have programs that support the large ZIP files? I've definitely run into issues with these before, especially on old Solaris servers.
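For the compression point above, `ZipOutputStream.setLevel()` is the knob to turn. A small sketch follows; the `ZipLevels` class and the entry name `data.bin` are made up for illustration. Note that the alternative, `setMethod(ZipOutputStream.STORED)`, requires the size and CRC of each entry to be set before writing, which makes it awkward for streaming, so lowering the level is usually the easier route.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: shows the effect of the compression level only.
class ZipLevels {
    /** Zips content as a single entry using the given Deflater level (0-9). */
    static byte[] zipWithLevel(byte[] content, int level) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(sink)) {
            zip.setLevel(level); // applies to entries started after this call
            zip.putNextEntry(new ZipEntry("data.bin"));
            zip.write(content);
            zip.closeEntry();
        }
        return sink.toByteArray();
    }

    /** Cheapest option for the CPU: deflate framing without real compression. */
    static byte[] zipUncompressed(byte[] content) throws IOException {
        return zipWithLevel(content, Deflater.NO_COMPRESSION);
    }
}
```

`Deflater.BEST_SPEED` is a reasonable middle ground if the BLOBs compress well and bandwidth matters more than CPU.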
Download multiple files with a single action
An HTTP response carries a single download, so one request cannot deliver several separate files at once.
There are two solutions:
- Open x number of windows to initiate the file downloads (this would be done with JavaScript)
- Preferred solution: create a script to zip the files
Generating ZIP files in azure blob storage
There are a few ways you can do this **from an Azure Batch only point of view**. (For the initial part, user code should own whatever ZIP API it uses to zip the files, but once the archive is in blob storage and you want to use it on the nodes, the options below apply.)
For the initial part of your question, I found this, which could come in handy: https://microsoft.github.io/AzureTipsAndTricks/blog/tip141.html (this is mainly for ideas; you will know better and need to design your solution space accordingly).
In options 1 and 3 below, you need to make sure your user code handles unzipping or unpacking the ZIP file. Option 2 is the Batch built-in feature for *.zip files at both pool and task level.
Option 1: You could have your *.rar or *.zip file added as Azure Batch resource files and then unzip them at the start task level, once the resource file is downloaded. See: Azure Batch Pool Start up task to download resource file from Blob FileShare
Option 2: The best option, if you have a zip but not a rar file in play, is the feature named Azure Batch application packages. Link here: https://learn.microsoft.com/en-us/azure/batch/batch-application-packages
The application packages feature of Azure Batch provides easy
management of task applications and their deployment to the compute
nodes in your pool. With application packages, you can upload and
manage multiple versions of the applications your tasks run, including
their supporting files. You can then automatically deploy one or more
of these applications to the compute nodes in your pool.
- https://learn.microsoft.com/en-us/azure/batch/batch-application-packages#application-packages
An application package is a .zip file that contains the application binaries and supporting files that are required for your
tasks to run the application. Each application package represents a
specific version of the application.
With regard to size: refer to the maximum blob size given in the documentation linked above.
Option 3: (Not sure if this will fit your scenario) This is a long shot for your specific scenario, but you could also mount blob storage as a virtual drive when the node joins the pool, via the mount feature in Azure Batch. You would then need to write code in the start task, or something similar, to unzip from the mounted location.
Hope this helps :)
best way to download large files from azure cloud storage
Your assumption is correct: if you want to use the ActionResult, you would need to download the file to the web role first and then stream it down to the client. If you can, you want to avoid this, particularly with large files, and leave it up to Azure Storage, because then Microsoft has to worry about dealing with the request and you don't have to pay for more web roles if you get lots of traffic.
This works well if all of the files you're hosting are public, but gets a little trickier if you want to secure the files (look into shared access signatures if that is what you want to do).
Have you tried setting the content type on the blob? Depending on how you've uploaded the files to blob storage, it may not be set. If you're uploading the blobs through your own code, you can access this through CloudBlob.Attributes.Properties.ContentType (from MSDN).
How to download a zip file in a struts2 application
You can't write to the client file system like that; in your case, your server is on your machine, but don't get fooled: it's a server path, not a client one. You need to write to the response.
You can't use both a Struts2 result and writing to the OutputStream together: when manually forging the response, you must bypass the framework convention and return the result yourself. The correct result for this case is Action.NONE:
<action name="Download" class="com.cdac.action.DownloadAction" />
public String execute() {
    /*
       do your stuff
    */
    return NONE;
}
You're lucky: here is a kick-off example I wrote a long time ago, explaining the whole thing (including the need to deal with duplicate filenames in the same ZIP).
Also try using Content-Length properly (to let the browser draw a realistic progress bar).
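On the duplicate filenames point: `ZipOutputStream.putNextEntry()` throws a `ZipException` if you add two entries with the same name, so the names have to be made unique before writing. A minimal sketch of one way to do that; the `ZipNames` class and the "name (1).ext" numbering scheme are my own assumptions, not taken from the linked example.

```java
import java.util.Set;

// Hypothetical helper: the numbering scheme is just one possible convention.
class ZipNames {
    /**
     * Returns name if it has not been used yet, otherwise "base (i).ext"
     * for the first free i. The chosen name is recorded in used.
     */
    static String uniqueName(Set<String> used, String name) {
        int dot = name.lastIndexOf('.');
        // A leading dot (e.g. ".bashrc") is treated as part of the base name.
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        String candidate = name;
        // Set.add returns false when the candidate is already taken.
        for (int i = 1; !used.add(candidate); i++) {
            candidate = base + " (" + i + ")" + ext;
        }
        return candidate;
    }
}
```

Keep one `Set` per archive and pass every filename through `uniqueName` before creating its `ZipEntry`.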
Create Zip archive from multiple in memory files in C#
Use ZipEntry and PutNextEntry() for this. The following shows how to do it for a file, but for an in-memory object just use a MemoryStream:
// Assumes the SharpZipLib package, which provides ZipOutputStream and ZipEntry.
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

FileStream fZip = File.Create(compressedOutputFile);
ZipOutputStream zipOStream = new ZipOutputStream(fZip);
foreach (FileInfo fi in allfiles)
{
    // One entry per file, named after the file itself.
    ZipEntry entry = new ZipEntry(fi.Name);
    zipOStream.PutNextEntry(entry);
    FileStream fs = File.OpenRead(fi.FullName);
    try
    {
        // Copy the file into the current entry in 1 KB chunks.
        byte[] transferBuffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = fs.Read(transferBuffer, 0, transferBuffer.Length)) > 0)
        {
            zipOStream.Write(transferBuffer, 0, bytesRead);
        }
    }
    finally
    {
        fs.Close();
    }
}
zipOStream.Finish();
zipOStream.Close();
Java multithreaded file downloading performance
To answer my own questions:
- The increased CPU usage was due to a while() {} loop that was waiting for the threads to finish. As it turns out, awaitTermination is a much better alternative to wait for an Executor to finish :)
- (And 3 and 4) This seems to be the nature of the beast; in the end I achieved what I wanted to do by using careful synchronization of the different threads that each download a chunk of data (well, in particular the writes of these chunks back to disk).
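For anyone hitting the same busy-wait problem, here is a minimal sketch of the awaitTermination approach. The `ChunkDownloads` class is made up, the "download" is replaced by a counter increment, and the one-minute timeout is an arbitrary placeholder.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: each task stands in for downloading one chunk.
class ChunkDownloads {
    /** Runs the given number of chunk tasks and returns how many completed. */
    static int runChunks(int chunks, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < chunks; i++) {
            pool.execute(() -> {
                // Real code would fetch and write one chunk here.
                done.incrementAndGet();
            });
        }
        // Stop accepting new tasks; already-submitted ones still run.
        pool.shutdown();
        // Block without spinning until all tasks finish or the timeout expires.
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            pool.shutdownNow();
            throw new IllegalStateException("downloads did not finish in time");
        }
        return done.get();
    }
}
```

The calling thread sleeps inside `awaitTermination` instead of burning a core in an empty loop.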
Can I do direct streaming of ZipFile in vaadin?
The JDK and Apache commons-compress don't let you stream ZIP archives lazily, so I implemented a Java ZIP library [1] to handle that. The current limitation is that it doesn't support ZIP64 extensions, which means it can't compress files bigger than 4 GiB and can't produce archives bigger than 4 GiB. I'm working on that.
[1] https://github.com/tsabirgaliev/zip