How to Download and Save a File from the Internet Using Java

How can I download and save a file from the Internet using Java?

Give Java NIO a try:

URL website = new URL("http://www.website.com/information.asp");
// try-with-resources closes both the channel and the file, even on failure
try (ReadableByteChannel rbc = Channels.newChannel(website.openStream());
     FileOutputStream fos = new FileOutputStream("information.html")) {
    fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
}

Using transferFrom() is potentially much more efficient than a simple loop that reads from the source channel and writes to the target channel. Many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them.

See the Javadoc for FileChannel.transferFrom() for more details.

Note: The third parameter of transferFrom() is the maximum number of bytes to transfer. Integer.MAX_VALUE allows at most 2^31 - 1 bytes (just under 2 GiB), while Long.MAX_VALUE allows at most 2^63 - 1 bytes (larger than any file in existence).
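
Since transferFrom() is allowed to move fewer bytes than requested, a more defensive variant calls it in a loop. The following is only a sketch of that idea: the class name NioDownload and the method download are made up for illustration, and a call that transfers zero bytes is treated as the end of the stream, which is an assumption that holds for blocking streams such as the one returned by URL.openStream().

```java
import java.io.IOException;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the original answer.
final class NioDownload {

    /** Copies everything readable from source into target and returns the byte count. */
    static long download(URL source, Path target) throws IOException {
        try (ReadableByteChannel in = Channels.newChannel(source.openStream());
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // Repeat until a call transfers nothing (assumed here to mean end of stream).
            while ((transferred = out.transferFrom(in, position, Long.MAX_VALUE)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```

It could be called as NioDownload.download(new URL("http://www.website.com/information.asp"), Paths.get("information.html")).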

Downloading file from internet and saving in specific folder using Java

You can just give the relative or absolute file path when specifying the file name in the FileOutputStream.

FileOutputStream fos = new FileOutputStream("/user/home/Desktop/Download/myFile.extn");
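
One thing to watch for: FileOutputStream does not create missing parent folders. A minimal sketch that creates the folder first might look like the following. The class name SaveToFolder and the method save are hypothetical, and it uses Files.copy from java.nio.file rather than a FileOutputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
final class SaveToFolder {

    /** Writes the stream to folder/fileName, creating the folder if it is missing. */
    static Path save(InputStream in, Path folder, String fileName) throws IOException {
        Files.createDirectories(folder);        // does nothing if the folder already exists
        Path target = folder.resolve(fileName);
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

The input stream would typically come from url.openStream(), and the folder from something like Paths.get("/user/home/Desktop/Download").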

How to download file from internet in java

That's because you haven't given the file a name, and writing to a file with no name makes no sense.

File file = new File("");  

If you replace that line with something like:

File file = new File("x.png");

...then it should work.

How to download a file from a website?

In answer to your first question: the string is the name of the file you want to save to. See the docs here. That should also answer your second question, since the same string determines where the file is saved.

Just remember that if you use a relative path, the file is saved relative to the directory the application is run from (its working directory), and you need write access to that directory.
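
If you are unsure where a relative name will end up, you can resolve it and check the directory before downloading. This is only a sketch; the class WhereWillItSave and its two methods are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original answer.
final class WhereWillItSave {

    /** Resolves a possibly relative file name to the absolute path it would be written to. */
    static Path resolve(String fileName) {
        return Paths.get(fileName).toAbsolutePath().normalize();
    }

    /** True when the directory that would contain the file exists and is writable. */
    static boolean canWriteTo(Path target) {
        Path dir = target.getParent();
        return dir != null && Files.isDirectory(dir) && Files.isWritable(dir);
    }
}
```

Printing WhereWillItSave.resolve("x.png") before opening the FileOutputStream shows the actual location being used.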

Cannot download file from URL in java

You are losing every other byte because of this loop:

while (fileIn.read() != -1) {     // 1st read
    fileOut.write(fileIn.read()); // 2nd read - 1st write
}

You are reading twice and writing only once.

What you need to do is

int x;
while ((x = fileIn.read()) != -1) { // 1st read
    fileOut.write(x);               // 1st write
}

Here is your complete code

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;

public class FileDownloader {

    public static void main(String[] args) throws IOException {

        Scanner s = new Scanner(System.in);

        System.out.println("Enter URL: ");
        String urlStr = s.nextLine();

        URL url = new URL(urlStr);
        URLConnection urlConnect = url.openConnection();

        System.out.println("Enter file name: ");
        String fileStr = s.nextLine();

        // try-with-resources closes both streams, which also flushes the file
        try (InputStream fileIn = urlConnect.getInputStream();
             FileOutputStream fileOut = new FileOutputStream(fileStr)) {
            int x;
            while ((x = fileIn.read()) != -1) { // read once per iteration
                fileOut.write(x);               // write the byte just read
            }
        }
        System.out.println("File is downloaded");
    }
}
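
The byte-at-a-time loop above is correct, but it makes one read and one write call per byte. A common alternative is to copy through a byte array. The following is a sketch of that variant, with a hypothetical class name BufferedCopy and an arbitrary 8 KiB buffer size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original answer.
final class BufferedCopy {

    /** Copies in to out in chunks of up to 8 KiB and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        return total;
    }
}
```

In FileDownloader, the inner while loop could then be replaced by BufferedCopy.copy(fileIn, fileOut).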

Download a file from internet without using RAM and show a progress bar

Download a file from internet without using RAM

This is a hazy specification. Any download will temporarily use at least a tiny amount of memory. This is almost unavoidable unless you find a way to stream bytes directly from the network card to a storage device such as a hard drive. Even those devices usually have internal memory that holds data before a write is carried out.

If you want to avoid using a lot of RAM, then streaming the file is the correct way to do it. The BufferedReader you are using does not hold the entire file in RAM, only the portion that has arrived but has not yet been written to the target. So unless your output (the hard drive) is slow or blocking, memory usage will stay low.

If, however, your output channel is slow or blocking, memory usage may climb.
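
To make the streaming idea concrete, here is a sketch of a copy loop that keeps only one fixed-size buffer in memory and reports progress after each chunk. The names ProgressCopy, copy and bar are hypothetical. The total size for the bar is assumed to be known, for example from URLConnection.getContentLengthLong(), which returns -1 when the server does not report a length.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.LongConsumer;

// Hypothetical helper, not part of the original answer.
final class ProgressCopy {

    /** Streams in to out through one 8 KiB buffer, reporting the running total. */
    static long copy(InputStream in, OutputStream out, LongConsumer onProgress) throws IOException {
        byte[] buffer = new byte[8192]; // the only buffer this loop allocates
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
            onProgress.accept(total);   // e.g. redraw a progress bar
        }
        return total;
    }

    /** Renders a textual bar such as "[#####-----] 50%"; totalBytes must be positive. */
    static String bar(long done, long totalBytes, int width) {
        int filled = (int) (width * done / totalBytes);
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < width; i++) {
            sb.append(i < filled ? '#' : '-');
        }
        return sb.append("] ").append(100 * done / totalBytes).append('%').toString();
    }
}
```

A caller might pass done -> System.out.print("\r" + ProgressCopy.bar(done, length, 40)) as the callback, where length is the reported content length.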

How to download a file using a Java REST service and a data stream

"How can I directly (without saving the file on 2nd server) download the file from 1st server to client's machine?"

Just use the Client API and get the InputStream from the response

Client client = ClientBuilder.newClient();
String url = "...";
final InputStream responseStream = client.target(url).request().get(InputStream.class);

There are two flavors to get the InputStream. You can also use

Response response = client.target(url).request().get();
InputStream is = (InputStream)response.getEntity();

Which one is the more efficient? I'm not sure, but the returned InputStreams are different classes, so you may want to look into that if you care to.

From 2nd server I can get a ByteArrayOutputStream to get the file from 1st server, can I pass this stream further to the client using the REST service?

So most of the answers you'll see in the link provided by @GradyGCooper seem to favor the use of StreamingOutput. An example implementation might be something like

final InputStream responseStream = client.target(url).request().get(InputStream.class);
System.out.println(responseStream.getClass());
StreamingOutput output = new StreamingOutput() {
    @Override
    public void write(OutputStream out) throws IOException, WebApplicationException {
        int length;
        byte[] buffer = new byte[1024];
        while ((length = responseStream.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
        out.flush();
        responseStream.close();
    }
};
// Content-Disposition parameters are separated by a semicolon, not a comma
return Response.ok(output).header(
        "Content-Disposition", "attachment; filename=\"...\"").build();

But if you look at the source code for StreamingOutputProvider, you'll see that its writeTo simply writes the data from one stream to another. So with the implementation above, the data is written twice.

How can we get down to a single write? Simply return the InputStream as the Response entity:

final InputStream responseStream = client.target(url).request().get(InputStream.class);
// Content-Disposition parameters are separated by a semicolon, not a comma
return Response.ok(responseStream).header(
        "Content-Disposition", "attachment; filename=\"...\"").build();

If we look at the source code for InputStreamProvider, it simply delegates to ReaderWriter.writeTo(in, out), which does what we did above in the StreamingOutput implementation:

public static void writeTo(InputStream in, OutputStream out) throws IOException {
    int read;
    final byte[] data = new byte[BUFFER_SIZE];
    while ((read = in.read(data)) != -1) {
        out.write(data, 0, read);
    }
}

Asides:

  • Client objects are expensive resources. You may want to reuse the same Client for every request. You can extract a WebTarget from the client for each request.

    WebTarget target = client.target(url);
    InputStream is = target.request().get(InputStream.class);

    I think the WebTarget can even be shared. I can't find anything in the Jersey 2.x documentation (only because it is a larger document, and I'm too lazy to scan through it right now :-), but the Jersey 1.x documentation says the Client and WebResource (which is equivalent to WebTarget in 2.x) can be shared between threads. So I'm guessing Jersey 2.x would be the same, but you may want to confirm for yourself.

  • You don't have to make use of the Client API. A download can be easily achieved with the java.net package APIs. But since you're already using Jersey, it doesn't hurt to use its APIs

  • The above is assuming Jersey 2.x. For Jersey 1.x, a simple Google search should get you a bunch of hits for working with the API (or the documentation I linked to above)
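
As an illustration of the second aside, a download using only java.net and java.nio.file might look like the sketch below. The class name PlainJavaDownload and the timeout values are assumptions made for the example, not anything prescribed by Jersey or the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
final class PlainJavaDownload {

    /** Streams the resource at url into target and returns the number of bytes written. */
    static long download(URL url, Path target) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(10_000); // illustrative values, in milliseconds
        conn.setReadTimeout(30_000);
        if (conn instanceof HttpURLConnection) {
            // For HTTP(S) URLs, fail early instead of saving an error page.
            int status = ((HttpURLConnection) conn).getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + url);
            }
        }
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Files.copy streams the data to disk, so the file is never held in memory as a whole.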


UPDATE

I'm such a dufus. While the OP and I were contemplating ways to turn a ByteArrayOutputStream into an InputStream, I missed the simplest solution, which is simply to write a MessageBodyWriter for the ByteArrayOutputStream:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.lang.annotation.Annotation;
import java.lang.reflect.Type;
import javax.ws.rs.WebApplicationException;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.MultivaluedMap;
import javax.ws.rs.ext.MessageBodyWriter;
import javax.ws.rs.ext.Provider;

@Provider
public class OutputStreamWriter implements MessageBodyWriter<ByteArrayOutputStream> {

    @Override
    public boolean isWriteable(Class<?> type, Type genericType,
            Annotation[] annotations, MediaType mediaType) {
        return ByteArrayOutputStream.class == type;
    }

    @Override
    public long getSize(ByteArrayOutputStream t, Class<?> type, Type genericType,
            Annotation[] annotations, MediaType mediaType) {
        return -1;
    }

    @Override
    public void writeTo(ByteArrayOutputStream t, Class<?> type, Type genericType,
            Annotation[] annotations, MediaType mediaType,
            MultivaluedMap<String, Object> httpHeaders, OutputStream entityStream)
            throws IOException, WebApplicationException {
        t.writeTo(entityStream);
    }
}

Then we can simply return the ByteArrayOutputStream in the response

return Response.ok(baos).build();

D'OH!

UPDATE 2

Here are the tests I used.

Resource class

@Path("test")
public class TestResource {

    final String path = "some_150_mb_file";

    @GET
    @Produces(MediaType.APPLICATION_OCTET_STREAM)
    public Response doTest() throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // try-with-resources closes the file once it has been read
        try (InputStream is = new FileInputStream(path)) {
            int len;
            byte[] buffer = new byte[4096];
            while ((len = is.read(buffer, 0, buffer.length)) != -1) {
                baos.write(buffer, 0, len);
            }
        }
        System.out.println("Server size: " + baos.size());
        return Response.ok(baos).build();
    }
}

Client test

public class Main {
    public static void main(String[] args) throws Exception {
        Client client = ClientBuilder.newClient();
        String url = "http://localhost:8080/api/test";
        Response response = client.target(url).request().get();
        String location = "some_location";
        FileOutputStream out = new FileOutputStream(location);
        InputStream is = (InputStream) response.getEntity();
        int len = 0;
        byte[] buffer = new byte[4096];
        while ((len = is.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
        out.flush();
        out.close();
        is.close();
    }
}

UPDATE 3

So the final solution for this particular use case was for the OP to simply pass the OutputStream from the StreamingOutput's write method to the third-party API. It seems the third-party API required an OutputStream as an argument.

StreamingOutput output = new StreamingOutput() {
    @Override
    public void write(OutputStream out) {
        thirdPartyApi.downloadFile(.., .., .., out);
    }
};
return Response.ok(output).build();

I'm not quite sure, but it seems that the reading and writing within the resource method, using the ByteArrayOutputStream, materialized the data in memory.

The point of the downloadFile method accepting an OutputStream is that it can write the result directly to the OutputStream provided. For instance, if you passed it a FileOutputStream, the download would be streamed directly to the file as it comes in.

It's not meant for us to keep a reference to the OutputStream, as you were trying to do with the baos, which is where the data ends up held in memory.

So with the way that works, we are writing directly to the response stream provided for us. The write method doesn't actually get called until the writeTo method (in the MessageBodyWriter) runs, which is where the OutputStream is passed to it.

You can get a better picture by looking at the MessageBodyWriter I wrote. Basically, in the writeTo method, replace the ByteArrayOutputStream with a StreamingOutput, then inside the method call streamingOutput.write(entityStream). You can see this in the StreamingOutputProvider I linked to in the earlier part of the answer. This is exactly what happens.
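
The deferred-callback idea can be illustrated without any JAX-RS types. In the sketch below everything is a hypothetical stand-in: BodyWriter plays the role of StreamingOutput, send plays the role of the framework's writeTo step, and produce plays the role of the third-party downloadFile call.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical, dependency-free illustration; none of these names come from JAX-RS.
final class DeferredWrite {

    /** Stand-in for StreamingOutput: a callback that writes the body when asked. */
    interface BodyWriter {
        void write(OutputStream out) throws IOException;
    }

    /** Stand-in for the framework step that finally supplies the response stream. */
    static void send(BodyWriter body, OutputStream responseStream) throws IOException {
        body.write(responseStream); // the callback runs only now, against the real target
        responseStream.flush();
    }

    /** Stand-in for a third-party call that writes its result to the given stream. */
    static void produce(int chunks, OutputStream out) throws IOException {
        byte[] chunk = new byte[1024];
        for (int i = 0; i < chunks; i++) {
            out.write(chunk); // each chunk goes straight to the caller's stream
        }
    }
}
```

Creating the callback, for example BodyWriter body = out -> DeferredWrite.produce(4, out), writes nothing by itself. The data only flows once send hands it a stream, and no intermediate copy of the whole payload is kept.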


