Is Close() Necessary When Using Iterator on a Python File Object

Does a File Object Automatically Close when its Reference Count Hits Zero?

The answer is in the link you provided.

The garbage collector will close the file when it destroys the file object, but:

  • you don't really have control over when it happens.

    While CPython uses reference counting to release resources deterministically
    (so you can predict when an object will be destroyed), other implementations
    don't have to. For example, Jython and IronPython use the JVM and .NET
    garbage collectors, which release (and finalize) objects only when there is
    a need to recover memory, and might not do that for some objects until the
    end of the program. And even for CPython, the GC algorithm may change in
    the future, as reference counting isn't very efficient. (A short sketch of
    this nondeterminism follows the list.)

  • if an exception is thrown when the file is closed during object destruction,
    you can't really do anything about it, because you won't know about it.
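
As a minimal sketch of that nondeterminism (assuming a scratch file named
"demo.txt"), weakref.finalize can report the moment the interpreter actually
deallocates, and therefore closes, an unreferenced file object:

import weakref

f = open("demo.txt", "w")   # "demo.txt" is a placeholder scratch file
weakref.finalize(f, print, "file object finalized (and closed)")

f.write("hello")  # buffered; reaches disk no later than close/finalization
del f             # CPython: refcount hits zero, the message prints immediately.
                  # Jython/IronPython: the GC may run much later, or never.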

Why should I close files in Python?

For the most part, not closing files is a bad idea, for the following reasons:

  1. It puts your program in the garbage collector's hands - though the file in theory will be auto-closed, it may not be. Python 3 and CPython generally do a pretty good job at garbage collecting, but not always, and other variants generally suck at it.

  2. It can slow down your program. Too many open files mean more space used in RAM, which will impact performance.

  3. For the most part, many changes to files in Python do not take effect until after the file is closed (or flushed), so if your script edits a file, leaves it open, and then reads it, it won't see the edits (see the sketch after this list).

  4. You could, theoretically, run into limits on how many files you can have open.

  5. As @sai stated below, Windows treats open files as locked, so legitimate things like AV scanners or other Python scripts can't read the file.

  6. It is sloppy programming (then again, I'm not exactly the best at remembering to close files myself!)
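
Here is a minimal sketch of reason 3, assuming a scratch file named
"demo.txt": because writes are buffered, a reader may see nothing until the
writer flushes or closes the file.

out = open("demo.txt", "w")    # "demo.txt" is a placeholder scratch file
out.write("first line\n")      # sits in the write buffer, not yet on disk

with open("demo.txt") as check:
    print(repr(check.read()))  # typically '' -- the write isn't visible yet

out.close()                    # closing flushes the buffer

with open("demo.txt") as check:
    print(repr(check.read()))  # 'first line\n'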

Does this line of Python close the file when it's finished?

The file may be closed for you implicitly by the garbage collector. There is more to it:

  • Is explicitly closing files important?
  • Is close() necessary when using iterator on a Python file object

It's recommended to use the with statement (a context manager) when dealing with files or file-like objects:

It is good practice to use the with keyword when dealing with file
objects. This has the advantage that the file is properly closed after
its suite finishes, even if an exception is raised on the way. It is
also much shorter than writing equivalent try-finally blocks:

with open(sFile) as input_file:
    lines = input_file.read().split("\r")  # "\r" == "0d".decode('hex') in Python 2

Do files automatically close if I don't assign them to a variable?

Files will close when the corresponding object is deallocated. The sample you give depends on that; there is no reference to the object, so the object will be removed and the file will be closed.

Important to note is that there isn't a guarantee made as to when the object will be removed. With CPython, you have reference counting as the basis of memory management, so you would expect the file to close immediately. In, say, Jython, the garbage collector is not guaranteed to run at any particular time (or even at all), so you shouldn't count on the file being closed and should instead close the file manually or (better) use a with statement.
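
As a concrete sketch ("data.txt" is a placeholder file name), the first line
below relies on CPython's reference counting, while the with version is
portable:

# No name is bound, so in CPython the temporary file object is deallocated
# (and closed) as soon as read() returns; it may also emit a ResourceWarning.
# On Jython the object may stay alive, and the file open, much longer.
print(open("data.txt").read())

# Portable version: guaranteed to close on every implementation.
with open("data.txt") as f:
    print(f.read())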

How does Python close files that have been GC'ed?

In CPython, at least, files are closed when the file object is deallocated. See the file_dealloc function in Objects/fileobject.c in the CPython source. Dealloc methods are sort of like __del__ for C types, except without some of the problems inherent to __del__.

Is the File-Object iterator broken?

I think this is, if anything, a docs bug in that paragraph, not a bug in io objects. (And io objects aren’t the only such thing; most trivially, a csv.reader wrapper around a file is just as restartable as the file itself.)

If you just use an iterator as an iterator, once it raises StopIteration it will keep on raising. But if you call methods outside of the iterator protocol, you’re not really using it as an iterator anymore, but as something more than an iterator. In that case, it seems legal and even idiomatic for the object to be “refillable” where that makes sense, as long as it never refills itself while it’s quacking as an iterator, only when it’s quacking as some other type that goes beyond that.
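
For concreteness, here is a minimal sketch of that refilling behavior, using
a scratch file whose name ("log.txt") is just a placeholder:

with open("log.txt", "w") as w:
    w.write("one\n")

f = open("log.txt")
print(list(f))        # ['one\n']; the file iterator is now exhausted
print(next(f, None))  # None -- as an iterator, it keeps raising StopIteration

with open("log.txt", "a") as w:
    w.write("two\n")  # more data arrives after the reader hit EOF

print(list(f))        # ['two\n'] -- the same "exhausted" iterator yields again
f.close()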

In a similar situation in C++, the language committee might well declare that this breaks substitutability and therefore the iterator becomes invalid as an iterator once you call such a method on it, even if the language can’t enforce that. Or come up with a whole new protocol for refillable iterators. (Of course C++ iterators aren’t quite the same thing as Python iterators, but hopefully you get what I mean.)

But in Python, practicality beats purity. I’m pretty sure Guido intended this behavior from the start, and that an object is allowed to do this and still be considered an iterator, and the core devs continue to intend it, and it’s just that nobody has thought about how to write something sufficiently rigorous to explain it accurately because nobody has asked.

If you ask by filing a docs bug, I’ll bet that this paragraph gets a footnote, rather than the io and other refillable iterator objects being reclassified as not actually iterators.


