How to Clone an InputStream

How to clone an InputStream?

If all you want to do is read the same information more than once, and the input data is small enough to fit into memory, you can copy the data from your InputStream to a ByteArrayOutputStream.

Then you can obtain the associated array of bytes and open as many "cloned" ByteArrayInputStreams as you like.

ByteArrayOutputStream baos = new ByteArrayOutputStream();

// Code simulating the copy
// You could alternatively use NIO
// And please, unlike me, do something about the Exceptions :D
byte[] buffer = new byte[1024];
int len;
while ((len = input.read(buffer)) > -1) {
    baos.write(buffer, 0, len);
}
baos.flush();

// Open new InputStreams using recorded bytes
// Can be repeated as many times as you wish
InputStream is1 = new ByteArrayInputStream(baos.toByteArray());
InputStream is2 = new ByteArrayInputStream(baos.toByteArray());

But if you really need to keep the original stream open to receive new data, you will need to prevent external code from calling close() on it, or at least track when that call happens.
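
One way to stop callers from closing the original stream is to hand them a wrapper whose close() does nothing. The following is only a minimal sketch: NonClosingInputStream and NonClosingDemo are hypothetical names, and readNBytes(int) assumes Java 11 or later. Commons IO ships a similar wrapper, CloseShieldInputStream, if you already depend on that library.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: forwards reads to the wrapped stream but ignores close().
class NonClosingInputStream extends FilterInputStream {
    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // Intentionally empty: whoever owns the original stream closes it.
    }
}

class NonClosingDemo {
    // Reads up to n bytes through a wrapper that the callee is free to "close".
    static String readChunk(InputStream original, int n) throws IOException {
        try (InputStream shielded = new NonClosingInputStream(original)) {
            return new String(shielded.readNBytes(n), StandardCharsets.UTF_8);
        } // close() runs here on the wrapper only; the original stays open
    }

    public static void main(String[] args) throws IOException {
        InputStream original =
                new ByteArrayInputStream("hello world".getBytes(StandardCharsets.UTF_8));
        System.out.println(readChunk(original, 5));
        System.out.println(readChunk(original, 6));
        original.close(); // the owner closes the real stream when it is done
    }
}
```

The owner of the original stream is still responsible for closing it eventually; the wrapper only shields it from third-party code.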

UPDATE (2019):

Since Java 9, the middle part can be replaced with InputStream.transferTo:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
input.transferTo(baos);
InputStream firstClone = new ByteArrayInputStream(baos.toByteArray());
InputStream secondClone = new ByteArrayInputStream(baos.toByteArray());
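
A possible variation, assuming Java 9 or later: InputStream.readAllBytes() returns the byte array directly, and sharing that one array between clones avoids the fresh copy that every baos.toByteArray() call allocates. StreamCloner and cloneable are hypothetical names used only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class StreamCloner {
    // Drains the source once and returns a factory for independent in-memory copies.
    static Supplier<InputStream> cloneable(InputStream source) throws IOException {
        byte[] data = source.readAllBytes(); // Java 9+; holds the whole stream in memory
        return () -> new ByteArrayInputStream(data); // every clone reads the same array
    }

    public static void main(String[] args) throws IOException {
        InputStream source =
                new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        Supplier<InputStream> copies = cloneable(source);
        try (InputStream first = copies.get(); InputStream second = copies.get()) {
            System.out.println(new String(first.readAllBytes(), StandardCharsets.UTF_8));
            System.out.println(new String(second.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```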

How to get copy of InputStream?

Use Commons-IO (http://commons.apache.org/proper/commons-io/) to get a byte[] of the data in the InputStream:

byte[] data = IOUtils.toByteArray(httpsURLConnection.getInputStream());

You can then store the data in a file by writing it to an OutputStream, or do whatever else you want with it.
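
If the data should go to a file anyway, a sketch using only the JDK (java.nio.file.Files instead of Commons-IO) could look like this. SpoolToFile, spool and the temp-file prefix and suffix are hypothetical choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SpoolToFile {
    // Copies the stream to a temporary file so it can be reopened any number of times.
    static Path spool(InputStream source) throws IOException {
        Path tmp = Files.createTempFile("stream-copy", ".bin"); // hypothetical prefix/suffix
        // The temp file already exists, so the copy has to be allowed to replace it.
        Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        Path file = spool(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        try (InputStream first = Files.newInputStream(file);
             InputStream second = Files.newInputStream(file)) {
            // Each stream has its own position in the file.
            System.out.println(first.read() + " " + second.read());
        } finally {
            Files.delete(file);
        }
    }
}
```

Unlike the byte[] approach, this does not require the whole stream to fit in memory.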

Clone InputStream

As already commented by Boris Spider, it is possible to read the whole stream, e.g. into a byte array, and then open new streams on that array:

byte[] byteArray = IOUtils.toByteArray(stream);
InputStream input1 = new ByteArrayInputStream(byteArray);
InputStream input2 = new ByteArrayInputStream(byteArray);

Object body;
try {
    // First attempt: treat the bytes as a serialized Java object
    ObjectInput ois = new ObjectInputStream(input1);
    body = ois.readObject();
} catch (Exception e) {
    try {
        // Fallback: read the same bytes again, this time as UTF-8 text
        body = IOUtils.toString(input2, Charset.forName("UTF-8"));
    } catch (Exception e2) {
        throw new MarshalException("Could not convert inputStream");
    }
}

How to make a deep copy of an InputStream in Java

InputStream is abstract, and neither it nor its subclasses expose their internal data. So the only way to "deep copy" an InputStream is to create a ByteArrayOutputStream, read() from the InputStream and write() that data to the ByteArrayOutputStream. Then do:

newStream = new ByteArrayInputStream(byteArrayOutputStream.toByteArray());

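Note that reset() only rewinds to the position of the last mark() call, and only on streams where markSupported() returns true. If you call mark() partway through, you can not get back to anything before that point, so that part of the stream is "consumed".

To "reuse" such an InputStream from the start, avoid calling mark() in the middle of reading and call reset() when you are done. For a ByteArrayInputStream that was never marked, you will then be reading from the beginning of the stream again.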
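
As a small illustration of the reset() approach, the sketch below reads a ByteArrayInputStream twice. It relies on the documented behavior that a ByteArrayInputStream which was never marked resets to position 0; other stream types may behave differently or not support reset() at all. ResetDemo and readTwice are hypothetical names, and readAllBytes() assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class ResetDemo {
    // Reads the whole stream, rewinds it with reset(), and reads it again.
    static String[] readTwice(ByteArrayInputStream in) throws IOException {
        String first = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        in.reset(); // mark() was never called, so this goes back to position 0
        String second = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        return new String[] {first, second};
    }

    public static void main(String[] args) throws IOException {
        ByteArrayInputStream in =
                new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        String[] result = readTwice(in);
        System.out.println(result[0] + " / " + result[1]);
    }
}
```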

Edited:

BTW, IOUtils uses this simple code snippet to copy InputStream:

public static int copy(InputStream input, OutputStream output) throws IOException {
    byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
    int count = 0;
    int n = 0;
    while (-1 != (n = input.read(buffer))) {
        output.write(buffer, 0, n);
        count += n;
    }
    return count;
}

Read more: http://kickjava.com/src/org/apache/commons/io/CopyUtils.java.htm

How to clone input stream but still re-use original

InputStreams are not generally cloneable, and not all streams support mark/reset. There are some possible workarounds within the standard JRE.

Wrap the InputStream into a BufferedInputStream. That one supports mark/reset within the limits of its buffer size. That enables you to read a limited amount of data from the beginning, then reset the stream.
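
A minimal sketch of that idea, peeking at the first few bytes and then rewinding. PeekHeader and peek are hypothetical names, and readNBytes(int) assumes Java 11 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Arrays;

class PeekHeader {
    // Returns up to n leading bytes, then rewinds so the caller still sees the full stream.
    static byte[] peek(BufferedInputStream in, int n) throws IOException {
        in.mark(n); // reset() stays valid as long as at most n bytes are read
        byte[] head = in.readNBytes(n);
        in.reset();
        return head;
    }

    public static void main(String[] args) throws IOException {
        BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3, 4}));
        System.out.println(Arrays.toString(peek(in, 2)));
        System.out.println(Arrays.toString(in.readAllBytes()));
    }
}
```

Reading more than the mark limit before calling reset() may invalidate the mark, so this only suits cases where the amount to re-read is known and bounded.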

Another alternative is PushbackInputStream, which allows you to "unread" data previously read. You need to buffer the data to be pushed back yourself though, so it may be a bit inconvenient to handle.
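
Roughly, the same peek can be done with PushbackInputStream as sketched below. PushbackDemo and peek are hypothetical names; the pushback buffer size passed to the constructor must be at least as large as the number of bytes you unread, and readNBytes(int) assumes Java 11 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.Arrays;

class PushbackDemo {
    // Reads up to n bytes and pushes them back, so the next read returns them again.
    static byte[] peek(PushbackInputStream in, int n) throws IOException {
        byte[] head = in.readNBytes(n);
        in.unread(head); // throws IOException if the pushback buffer is too small
        return head;
    }

    public static void main(String[] args) throws IOException {
        // The second constructor argument is the pushback buffer size in bytes.
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3, 4}), 2);
        System.out.println(Arrays.toString(peek(in, 2)));
        System.out.println(Arrays.toString(in.readAllBytes()));
    }
}
```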

If the whole stream isn't terribly big, you could also read the entire stream first, then construct as many ByteArrayInputStreams as needed from the pre-read data. This is only feasible if the data fits in the heap, and a single byte array cannot hold more than about 2 GB.

How can I copy inputstream into two copies

You have multiple options.

  1. Open the file with Files.newInputStream, hand that stream to your hashing algorithm, obtain the hash, then open the file again to send it. This is the best option if it is highly useful to have the hash available during/before the upload. It requires reading the bytes off of disk twice, of course.

  2. Use an existing implementation of a stream that hashes on the fly, such as Guava's HashingInputStream, or write something like this on your own (it's not particularly difficult to do so).
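
For option 2, the JDK itself also has such a stream: java.security.DigestInputStream. The sketch below uses it in place of Guava's class; DigestWhileCopying and copyAndHash are hypothetical names, and transferTo assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestWhileCopying {
    // Copies source to sink and returns the SHA-256 digest of the bytes that passed through.
    static byte[] copyAndHash(InputStream source, OutputStream sink)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        // Closing the DigestInputStream also closes the source stream.
        try (DigestInputStream hashing = new DigestInputStream(source, sha256)) {
            hashing.transferTo(sink); // every byte read also updates the digest
        }
        return sha256.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] digest = copyAndHash(new ByteArrayInputStream(data), sink);
        System.out.println(sink.size() + " bytes copied, digest has " + digest.length + " bytes");
    }
}
```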

You can't easily have 2 inputstreams that can both be fully streamed through whilst only causing the file to be read once, because the 'user' of an inputstream decides how 'fast' you go through, and you can't have 2 separate lines of code both be in charge.

Hence, one of the two processes needs to not be an inputstream and instead have its control reversed: Instead of allowing the code to ask the inputstream for more data (by calling one of its read methods), you'd have some code that is invoked by the inputstream with: Hey, I just read this data because the 'primary' driver asked for it, before I hand it off to the primary driver, anything you need to do here?

The hashing code should be this secondary driver, because it can trivially deal with 'here are X bytes but we are not done yet please process it'.

Here is an example of what that would look like. Note that FilterInputStream by default just forwards all calls directly to the stream you wrap.

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashingInputStream extends FilterInputStream {
    private final MessageDigest hash;

    public HashingInputStream(InputStream base) {
        super(base);
        try {
            hash = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to support SHA-256
            throw new IllegalStateException(e);
        }
    }

    @Override
    public int read() throws IOException {
        int v = super.read();
        if (v == -1) return v;
        hash.update((byte) v);
        return v;
    }

    // read(byte[] b) is deliberately not overridden: FilterInputStream implements
    // it by calling read(b, 0, b.length), so overriding both would hash the data twice.

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int r = super.read(b, off, len);
        if (r == -1) return r;
        hash.update(b, off, r);
        return r;
    }

    public byte[] digest() {
        return hash.digest();
    }
}

Your upload code then wraps the InputStream, e.g.:

try (var in = Files.newInputStream(pathToYourFile)) {
    var hashing = new HashingInputStream(in);
    hashing.transferTo(yourOutputStream);
    var hash = hashing.digest();
}

And you'll get your hash at the end; the file is only read once.


