Python - When to Use File VS Open

You should always use open().

As the documentation states:

When opening a file, it's preferable
to use open() instead of invoking this
constructor directly. file is more
suited to type testing (for example,
writing "isinstance(f, file)").

Also, file() was removed in Python 3.0.
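
Since file no longer exists in Python 3, the type-testing role mentioned in the quote is covered there by the classes in the io module. A minimal sketch, assuming Python 3 and a throwaway temporary file (the helper name describe_handle is hypothetical, not from the documentation):

```python
import io
import os
import tempfile


def describe_handle(path):
    """Open path with open() and report what kind of object came back."""
    with open(path, "r") as f:
        # In Python 3, text-mode open() returns an io.TextIOWrapper,
        # and file objects derive from io.IOBase.
        return type(f).__name__, isinstance(f, io.IOBase)


# Create a small temporary file so the example is self-contained.
fd, tmp_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    type_name, is_file_like = describe_handle(tmp_path)
finally:
    os.remove(tmp_path)

print(type_name, is_file_like)
```

On Python 2 the equivalent check would be isinstance(f, file), as the quoted documentation says.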

File read using open() vs with open()

The with statement is not about performance. I do not think there is any performance gain or loss associated with using it, as long as you perform the same cleanup that the with statement would perform automatically.

When you use the with statement with the open function, you do not need to close the file at the end, because with automatically closes it for you.
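
To illustrate the cleanup point, here is a small sketch that reads the same file twice, once with an explicit try/finally and once with with. The function names and the temporary file are assumptions made for the example:

```python
import os
import tempfile


def count_lines_manual(path):
    """Count lines, closing the file by hand in a finally block."""
    f = open(path, "r")
    try:
        return sum(1 for _ in f)
    finally:
        f.close()


def count_lines_with(path):
    """Count lines, letting the with statement close the file."""
    with open(path, "r") as f:
        return sum(1 for _ in f)


# Build a three-line temporary file for the demonstration.
fd, tmp_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("one\ntwo\nthree\n")

try:
    manual_count = count_lines_manual(tmp_path)
    with_count = count_lines_with(tmp_path)

    # The name f stays bound after the block, but the file is closed.
    with open(tmp_path, "r") as f:
        f.read()
    closed_after_with = f.closed
finally:
    os.remove(tmp_path)
```

Both functions do the same work; the with version simply cannot forget the close.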

Also, the with statement is not just for opening files; it is used in conjunction with context managers. Basically, if you have an object that must be cleaned up once you are done with it, or when an error occurs, you can define it as a context manager, and the with statement will call its __enter__() and __exit__() methods on entry to and exit from the with block. According to PEP 0343:

This PEP adds a new statement "with" to the Python language to make it possible to factor out standard uses of try/finally statements.

In this PEP, context managers provide __enter__() and __exit__() methods that are invoked on entry to and exit from the body of the with statement.
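
As a rough illustration of the protocol described in the PEP, the following sketch defines a hypothetical class named Tracked (not part of any library) that records when its __enter__() and __exit__() methods run, including when the body raises:

```python
class Tracked:
    """Toy context manager that records the order of events."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and when the body raises.
        self.events.append("exit")
        return False  # returning False lets the exception propagate


resource = Tracked()
try:
    with resource:
        resource.events.append("body")
        raise ValueError("simulated failure")
except ValueError:
    resource.events.append("handled")

print(resource.events)
```

The "exit" event is recorded before the except clause runs, which is the try/finally behaviour the PEP sets out to factor out.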

Also, here is a timing comparison with and without the with statement:

In [14]: def foo():
   ....:     f = open('a.txt','r')
   ....:     for l in f:
   ....:         pass
   ....:     f.close()
   ....:

In [15]: def foo1():
   ....:     with open('a.txt','r') as f:
   ....:         for l in f:
   ....:             pass
   ....:

In [17]: %timeit foo()
The slowest run took 41.91 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 186 µs per loop

In [18]: %timeit foo1()
The slowest run took 206.14 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 179 µs per loop

In [19]: %timeit foo()
The slowest run took 202.51 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 180 µs per loop

In [20]: %timeit foo1()
10000 loops, best of 3: 193 µs per loop

In [21]: %timeit foo1()
10000 loops, best of 3: 194 µs per loop
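
Outside IPython, a comparable measurement can be sketched with the standard timeit module. The file contents, the function names and the repetition count below are arbitrary choices for illustration, and the absolute numbers will differ from machine to machine:

```python
import os
import tempfile
import timeit

# Temporary stand-in for a.txt so the script is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("line\n" * 1000)


def read_manual():
    f = open(path, "r")
    for _ in f:
        pass
    f.close()


def read_with():
    with open(path, "r") as f:
        for _ in f:
            pass


try:
    # Total seconds for 200 calls of each variant.
    manual_time = timeit.timeit(read_manual, number=200)
    with_time = timeit.timeit(read_with, number=200)
finally:
    os.remove(path)

print("manual close: %.6f s, with: %.6f s" % (manual_time, with_time))
```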

When should I ever use file.read() or file.readlines()?

The short answer to your question is that each of these three methods of reading a file has different use cases. As noted above, f.read() reads the file as a single string, and so allows relatively easy file-wide manipulations, such as a file-wide regex search or substitution.

f.readline() reads a single line of the file, allowing the user to parse a single line without necessarily reading the entire file. Using f.readline() also allows easier application of logic in reading the file than a complete line by line iteration, such as when a file changes format partway through.

Using the syntax for line in f: allows the user to iterate over the file line by line as noted in the question.
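
The three approaches can be compared side by side in a short sketch. The file contents and variable names here are made up for the example:

```python
import os
import tempfile

# Small sample file: one header line followed by two data lines.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("header\nalpha\nbeta\n")

try:
    # f.read(): the whole file as one string.
    with open(path) as f:
        whole = f.read()

    # f.readline(): handle the first line specially, then take the rest.
    with open(path) as f:
        header = f.readline()
        rest = f.readlines()

    # for line in f: plain line-by-line iteration.
    with open(path) as f:
        stripped = [line.rstrip("\n") for line in f]
finally:
    os.remove(path)
```

Note that read() and readlines() load everything into memory, while iterating over f handles one line at a time.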

(As noted in the other answer, this documentation is a very good read):

https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects

Note:
It was previously claimed that f.readline() could be used to skip a line during a for loop iteration. However, this doesn't work in Python 2.7, and is perhaps a questionable practice, so this claim has been removed.

Python, difference between 'open' and 'with open'

This error was caused by a previous version of the posted script. It looked like this:

if not(os.path.exists('teams')):
    os.makedirs('mydir')

This tests for the existence of the directory teams but tries to create a new directory mydir.

Suggested solution: use variable names for everything, don't hardwire strings for paths:

path = 'mydir'

if not(os.path.exists(path)):
    os.makedirs(path)

And yes, both #1 and #2 do essentially the same thing. But the with statement also closes the file if an exception occurs during writing.
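
A minimal sketch of both points, assuming Python 3.2 or later: the directory path lives in one variable, and the file is closed even though the write is interrupted. The temporary working directory, the file name out.txt and the simulated error are all invented for the example:

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()           # scratch area for the demo
path = os.path.join(workdir, "mydir")  # one variable, used everywhere

# exist_ok=True (Python 3.2+) replaces the separate exists() test.
os.makedirs(path, exist_ok=True)

target = os.path.join(path, "out.txt")
try:
    with open(target, "w") as f:
        f.write("partial")
        raise RuntimeError("simulated failure during writing")
except RuntimeError:
    pass

# The with statement closed the file despite the exception.
closed_after_error = f.closed
directory_created = os.path.isdir(path)

shutil.rmtree(workdir)
```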

Why is open() preferable over file() in Python?

The Zen of Python:

There should be one-- and preferably only one --obvious way to do it.

So either file or open should go.

>>> type(file)
<type 'type'>
>>> type(open)
<type 'builtin_function_or_method'>

open is a function that can return anything. file() returns only file objects.

Though it seems open returns only file objects on Python 2. And before Python 2.5, file and open were the same object.

As @gnibbler suggested in the comments the original reason for the existence of file might be to use it as the name for base classes.

Also, file() in principle could return other types as for example int() did on earlier Python versions:

>>> type(int(2**64)) is long
True
>>> type(int()) is int
True
>>> int is long
False

This answer is very similar to @Ryan's answer.

In addition BDFL said:

"The file class is new in Python 2.2. It represents the type (class)
of objects returned by the built-in open() function. Its constructor
is an alias for open(), but for future and backwards compatibility,
open() remains preferred." (emphasis mine)

What's the difference between 'r+' and 'a+' when open file in python?

Python opens files in almost the same way as C:

  • r+ Open for reading and writing. The stream is positioned at the beginning of the file.

  • a+ Open for reading and appending (writing at end of file). The file is created if it does not exist. The initial file position for reading is at the beginning of the file, but output is appended to the end of the file (on some Unix systems, regardless of the current seek position).
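
The difference between the two modes can be tried out with a small sketch like the one below, assuming Python 3 on a POSIX system. The sketch seeks explicitly before reading rather than relying on the initial position, since that can differ between platforms and versions; the file name demo.txt is arbitrary:

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")

# r+ does not create the file; opening a missing file fails.
try:
    open(path, "r+")
    r_plus_created = True
except FileNotFoundError:
    r_plus_created = False

# a+ creates the file if it does not exist.
with open(path, "a+") as f:
    f.write("hello")

# r+ starts at the beginning, so this overwrites the first character.
with open(path, "r+") as f:
    f.write("J")

# a+ appends: the write lands at the end even after seeking to 0.
with open(path, "a+") as f:
    f.seek(0)
    before_append = f.read()
    f.seek(0)
    f.write("!")

with open(path) as f:
    final = f.read()

shutil.rmtree(workdir)
```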

Difference between io.open vs open in python

Situation in Python3 according to the docs:

io.open(file, *[options]*)

This is an alias for the builtin open() function.

and

While the builtin open() and the associated io module are the
recommended approach for working with encoded text files, this module
[i.e. codecs] provides additional utility functions and classes that
allow the use of a wider range of codecs when working with binary
files

(bold and italics are my edits)
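
On Python 3 the alias can be checked directly. The sketch below also writes and reads a small UTF-8 file through both names; the sample text and temporary file are assumptions for the example:

```python
import io
import os
import tempfile

# In Python 3, io.open and the builtin open are the same object.
same_object = io.open is open

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Write through io.open and read back through the builtin.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write("caf\u00e9\n")
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
finally:
    os.remove(path)
```

On Python 2 the two names differ, and io.open is the one that accepts the encoding argument.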


