Why Is Files.Lines (And Similar Streams) Not Automatically Closed


Yes, this was a deliberate decision. We considered both alternatives.

The operating design principle here is "whoever acquires the resource should release the resource". Files don't auto-close when you read to EOF; we expect files to be closed explicitly by whoever opened them. Streams that are backed by IO resources are the same.

Fortunately, the language provides a mechanism for automating this for you: try-with-resources. Because Stream implements AutoCloseable, you can do:

try (Stream<String> s = Files.lines(...)) {
    s.forEach(...);
}

The argument that "it would be really convenient to auto-close so I could write it as a one-liner" is nice, but would mostly be the tail wagging the dog. If you opened a file or other resource, you should also be prepared to close it. Effective and consistent resource management trumps "I want to write this in one line", and we chose not to distort the design just to preserve the one-line-ness.
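The mechanism is observable without real files: onClose registers the same kind of close handler that Files.lines installs, and try-with-resources triggers it. A minimal sketch (the stream contents and class name are made up):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

public class CloseDemo {
    // Returns true if the stream's close handler ran after the try block.
    public static boolean closeHandlerRuns() {
        AtomicBoolean closed = new AtomicBoolean(false);
        // Streams backed by IO resources register a close handler via onClose;
        // try-with-resources calls close(), which in turn runs that handler.
        try (Stream<String> s = Stream.of("a", "b").onClose(() -> closed.set(true))) {
            s.forEach(line -> { /* process each line */ });
        }
        return closed.get();
    }
}
```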

Files.lines : close stream

The Stream returned by Files.lines implements AutoCloseable, so you can use try-with-resources.

// `pattern` is assumed to be a java.util.regex.Pattern defined elsewhere
try (Stream<String> fileLines = Files.lines(fileP, StandardCharsets.UTF_8)) {

    Map<String, Long> wordCounts = fileLines
            .map(line -> line.replaceAll("[.?!]$", " % "))
            .flatMap(pattern::splitAsStream)
            .collect(Collectors.groupingBy(String::toLowerCase,
                    TreeMap::new, Collectors.counting()));

} catch (Exception e) {
    e.printStackTrace();
}

After the try block is executed, fileLines.close() is called automatically.
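For reference, the snippet above relies on a `pattern` variable defined elsewhere. A self-contained sketch, assuming `pattern` is a whitespace `Pattern` and using `StandardCharsets.UTF_8` in place of `Charset.forName("UTF-8")`:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordCount {
    // Assumption: the original `pattern` splits on whitespace.
    private static final Pattern PATTERN = Pattern.compile("\\s+");

    public static Map<String, Long> countWords(Path file) throws Exception {
        try (Stream<String> fileLines = Files.lines(file, StandardCharsets.UTF_8)) {
            return fileLines
                    .map(line -> line.replaceAll("[.?!]$", " % "))
                    .flatMap(PATTERN::splitAsStream)
                    .filter(word -> !word.isEmpty())
                    .collect(Collectors.groupingBy(String::toLowerCase,
                            TreeMap::new, Collectors.counting()));
        } // fileLines.close() runs here, releasing the file handle
    }
}
```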

P.S.: Read rzwitserloot's answer; it contains useful information.

Closing mapped streams - what's the idea?

The general rule about resource handling is that whoever opened a resource is responsible for closing it. The flatMap operation is the only operation in the Stream API that internally opens a Stream itself, so it is the only operation that closes one.

Quoting from this mail, Brian Goetz said:

To summarize, flatMap() is the only operation that internally closes the
stream after its done, and for good reason -- it is the only case where
the stream is effectively opened by the operation itself, and therefore
should be closed by the operation too. Any other streams are assumed to
be opened by the caller, and therefore should be closed by the caller.

The example given is the following. Consider

try (Stream<Path> paths = Files.walk(dir)) {
    Stream<String> stream = paths.flatMap(p -> {
        try {
            return Files.lines(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    });
}

The call to Files.lines(p) returns a Stream<String> of the lines of the file. When the flat-mapping operation is over, the opened resource used to read the file is expected to be closed. The question is: closed by what? Well, closed by flatMap itself, because it is the operation that opened the Stream in the first place.

Files.lines returns a Stream with a pre-registered close handler that closes the underlying BufferedReader. When the flatMap operation is done, this close handler is invoked and the resources are correctly released.
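This behavior is easy to observe without any files: register a close handler on each inner stream and count how often flatMap invokes it. A minimal sketch (not JDK code, just an illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class FlatMapCloseDemo {
    // Counts how many inner streams had their close handler invoked by flatMap.
    public static int innerStreamsClosed() {
        AtomicInteger closed = new AtomicInteger();
        Stream.of("a", "b", "c")
              .flatMap(s -> Stream.of(s, s)                        // flatMap opens this stream...
                                  .onClose(closed::incrementAndGet)) // ...and closes it when done
              .forEach(x -> { });
        return closed.get();
    }
}
```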


The reason this idea is extended to the flatMapTo* operations is the same: adhering to the above rule that every resource allocated by an operation should be closed by that operation.

Just to show that you can build an IntStream with an underlying resource to close, consider the following Stream pipeline, where each path is flat-mapped not to its lines but to the number of characters in each line.

try (Stream<Path> paths = Files.walk(dir)) {
    IntStream stream = paths.flatMapToInt(p -> {
        try {
            return Files.lines(p).mapToInt(String::length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    });
}

Resource leak in Files.list(Path dir) when stream is not explicitly closed?

If you close the Stream, Files.list() does close the underlying DirectoryStream it uses to stream the files, so there should be no resource leak as long as you close the Stream.

You can see where the DirectoryStream is closed in the source code for Files.list() here:

return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.DISTINCT), false)
.onClose(asUncheckedRunnable(ds));

The key thing to understand is that a Runnable is registered with the Stream via Stream::onClose and is invoked when the stream itself is closed. That Runnable is created by a factory method, asUncheckedRunnable, which produces a Runnable that closes the resource passed to it, translating any IOException thrown during close() into an UncheckedIOException.
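asUncheckedRunnable is a private JDK helper, but the pattern is simple to reproduce by hand. A sketch with a fake Closeable standing in for the DirectoryStream:

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.stream.Stream;

public class OnCloseDemo {
    // Hand-rolled equivalent of the JDK's private asUncheckedRunnable helper:
    // close the resource, rethrowing IOException as UncheckedIOException.
    static Runnable asUncheckedRunnable(Closeable c) {
        return () -> {
            try {
                c.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    public static boolean resourceClosed() {
        boolean[] closed = { false };
        Closeable resource = () -> closed[0] = true;  // stands in for the DirectoryStream
        try (Stream<String> s = Stream.of("f1", "f2")
                                      .onClose(asUncheckedRunnable(resource))) {
            s.forEach(f -> { });
        }
        return closed[0];  // true: closing the stream closed the resource
    }
}
```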

You can ensure that the DirectoryStream is closed by closing the Stream like this:

try (Stream<Path> files = Files.list(Paths.get(destination))) {
    files.forEach(path -> {
        // Do stuff
    });
}

When does a stream close if it's not closed manually?

I don't think the JVM spec makes any guarantee about that. You really are supposed to close these resources yourself.

When the process ends, the operating system will release all resources associated with it (including memory, file handles, and network sockets).

There are OS facilities to inspect open files and streams, such as lsof.

Are streams closed automatically on error?

The OS itself might close the streams and deallocate resources because the process (namely, the JVM) terminates, but it is not mandated to do so.

You should always close it in a finally block in cases like these, e.g. like this:

InputStream is = null;

try {
    is = new FileInputStream(new File("lolwtf"));
    // read stuff here
} catch (IOException e) {
    System.out.println("omfg, it didn't work");
} finally {
    if (is != null) {
        try {
            is.close();
        } catch (IOException e) {
            // nothing sensible left to do if close itself fails
        }
    }
}

If the stream failed to open in the first place there is nothing to close, and you'll probably want to terminate at that point anyway, since your data source is broken in some way. You can find out more if you keep the InputStream's provider around; for example, if I kept a reference to the File object in my example, I could check via File's interface whether it still exists, but that's specific to your particular data provider.
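On Java 7 and later, the same can be written with try-with-resources, which closes the stream even when reading throws and avoids the null check the manual finally block needs. A sketch (the filename is the answer's placeholder; the boolean return is added only for illustration):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadDemo {
    // Returns true if the file could be opened; the stream is closed either way.
    public static boolean read(String filename) {
        try (InputStream is = new FileInputStream(filename)) {
            // read stuff here
            return true;
        } catch (IOException e) {
            System.out.println("omfg, it didn't work");
            return false;
        }
    }
}
```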

This tactic gets more useful with network sessions that throw, e.g., with Hibernate...

How to close a java.util.Stream and use a terminal operation

See docs.

From the Java 8 docs:

If timely disposal of file system resources is required, the try-with-resources construct should be used to ensure that the stream's close method is invoked after the stream operations are completed.

From the Java 12 docs:

This method must be used within a try-with-resources statement or similar control structure to ensure that the stream's open file is closed promptly after the stream's operations have completed.

I think you might want to consider using a try-with-resources construct, especially as the wording in the current docs has evolved from "should" into "must".

Java8 Stream of files, how to control the closing of files?

A general note on the use of FileReader: FileReader internally uses a FileInputStream, which overrides finalize() and is therefore discouraged because of the impact this has on garbage collection, especially when dealing with lots of files.

Unless you're using a Java version prior to Java 7, you should use the java.nio.file API instead, creating a BufferedReader with

Path path = Paths.get(filename);
BufferedReader br = Files.newBufferedReader(path);

So the beginning of your stream pipeline should look more like

filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try {
                 return Optional.of(Files.newBufferedReader(p));
             } catch (IOException e) {
                 return Optional.empty();
             }
         })

Now to your problem:

Option 1

One way to preserve the original Reader would be to use a Tuple. A tuple (or any n-ary variation of it) is generally a good way to handle multiple results of a function application, as it's done in a stream pipeline:

class ReaderTuple<T> {
    final Reader first;
    final T second;

    ReaderTuple(Reader r, T s) {
        first = r;
        second = s;
    }
}

Now you can map the FileReader to a Tuple with the second item being your current stream item:

filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try {
                 return Optional.of(Files.newBufferedReader(p));
             } catch (IOException e) {
                 return Optional.empty();
             }
         })
         .filter(Optional::isPresent)
         .map(Optional::get)
         .map(r -> new ReaderTuple<>(r, yourOtherItem))
         ....
         .peek(rt -> {
             try {
                 rt.first.close(); // close the reader, or use try-with-resources
             } catch (Exception e) {}
         })
         ...
...

The problem with that approach is that whenever an unchecked exception occurs during stream execution between the tuple creation and the peek, the readers might not be closed.

Option 2

An alternative to using a tuple is to put the code that requires the reader inside a try-with-resources block. This approach has the advantage that you're in control of closing all readers.

Example 1:

filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try (Reader r = Files.newBufferedReader(p)) {
                 return Stream.of(r)
                         .... // put here your stream code that uses the reader
             } catch (IOException e) {
                 return Optional.empty();
             }
         }) // reader is implicitly closed here
         .... // terminal operation here

Example 2:

filenames.map(Paths::get)
         .filter(Files::exists)
         .map(p -> {
             try {
                 return Optional.of(Files.newBufferedReader(p));
             } catch (IOException e) {
                 return Optional.empty();
             }
         })
         .filter(Optional::isPresent)
         .map(Optional::get)
         .flatMap(reader -> {
             try (Reader r = reader) {
                 // read from your reader here and return the items to flatten
             } // reader is implicitly closed here
         })

Example 1 has the advantage that the reader is certainly closed. Example 2 is safe unless you put something between the creation of the reader and the try-with-resources block that may fail.

I personally would go for Example 1 and put the code that accesses the reader into a separate function, so the code is more readable.
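A sketch of that refactoring (countLines and lineCounts are made-up names, and counting lines stands in for whatever per-file processing you need):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PerFileProcessing {
    // Opens, processes and closes one file; the reader never escapes this method.
    static Optional<Long> countLines(Path p) {
        try (BufferedReader r = Files.newBufferedReader(p)) {
            return Optional.of(r.lines().count());  // use the reader here
        } catch (IOException e) {
            return Optional.empty();
        } // reader is closed here in every case
    }

    public static List<Long> lineCounts(Stream<String> filenames) {
        return filenames.map(Paths::get)
                        .filter(Files::exists)
                        .map(PerFileProcessing::countLines)
                        .filter(Optional::isPresent)
                        .map(Optional::get)
                        .collect(Collectors.toList());
    }
}
```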

Does closing a Stream close the BufferedReader source?

No, it seems it doesn't, as the stream is created using

return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
        iter, Spliterator.ORDERED | Spliterator.NONNULL), false);

which doesn't pass any reference to the BufferedReader.
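This is easy to verify without touching the file system: close the stream returned by BufferedReader.lines() and the reader remains usable. A minimal sketch over a StringReader:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.stream.Stream;

public class LinesCloseDemo {
    // Reads from the reader *after* its lines() stream was closed;
    // if closing the stream closed the reader, this would throw instead.
    public static String readAfterStreamClose() throws IOException {
        BufferedReader br = new BufferedReader(new StringReader("first\nsecond\n"));
        Stream<String> lines = br.lines();
        lines.close();          // closes only the stream, not the reader
        return br.readLine();   // the reader still works
    }
}
```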


