Python - When to use file vs open
You should always use open().
As the documentation states:
When opening a file, it's preferable
to use open() instead of invoking this
constructor directly. file is more
suited to type testing (for example,
writing "isinstance(f, file)").
Also, file() was removed in Python 3.0.
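For type testing on Python 3, where file no longer exists, one option is to check against the io base classes instead. This is a minimal sketch, assuming a scratch file in a temporary directory (the path is made up for the illustration):

```python
import builtins
import io
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "a.txt")

with open(path, "w") as f:
    # Objects returned by open() derive from io.IOBase;
    # text-mode files also derive from io.TextIOBase.
    is_file_object = isinstance(f, io.IOBase)
    is_text_file = isinstance(f, io.TextIOBase)

# The old file builtin is not defined on Python 3.
file_builtin_exists = hasattr(builtins, "file")
```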
File read using open() vs with open()
The with statement is not about performance. I do not think there is any performance gain or loss from using it, as long as you perform the same cleanup that the with statement would otherwise perform automatically.
When you use the with statement with the open function, you do not need to close the file at the end, because with closes it for you automatically.
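To make "the same cleanup" concrete, here is a rough sketch of the manual equivalent next to the with form. The function names and the scratch file are made up for the illustration:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "a.txt")
with open(path, "w") as f:
    f.write("first\nsecond\n")

def count_lines_manual(path):
    """Open and close by hand; try/finally guarantees the close."""
    f = open(path, "r")
    try:
        return sum(1 for _ in f)
    finally:
        f.close()

def count_lines_with(path):
    """Same work; the with statement closes the file on exit."""
    with open(path, "r") as f:
        return sum(1 for _ in f)

manual_count = count_lines_manual(path)
with_count = count_lines_with(path)
```

Both functions do the same work; the with version simply cannot forget the close.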
Also, the with statement is not just for opening files; it is used in conjunction with context managers. Basically, if you have an object that must be cleaned up once you are done with it, or when some kind of error occurs, you can define it as a context manager, and the with statement will call its __enter__() and __exit__() methods on entry to and exit from the with block. According to PEP 0343 -
This PEP adds a new statement "with" to the Python language to make it possible to factor out standard uses of try/finally statements. In this PEP, context managers provide __enter__() and __exit__() methods that are invoked on entry to and exit from the body of the with statement.
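As an illustration of that protocol, a small hand-written context manager might look like the following. The Resource class is hypothetical and only records the order in which its methods are called:

```python
class Resource:
    """Hypothetical resource that records its lifecycle events."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and when the body raises.
        self.events.append("exit")
        return False  # do not suppress the exception

res = Resource()
try:
    with res:
        res.events.append("body")
        raise ValueError("simulated failure")
except ValueError:
    res.events.append("handled")
```

Here __exit__() runs before the exception reaches the except clause, which is the try/finally behaviour the PEP describes.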
Also, here is a timing comparison of using with and not using it -
In [14]: def foo():
   ....:     f = open('a.txt','r')
   ....:     for l in f:
   ....:         pass
   ....:     f.close()
   ....:
In [15]: def foo1():
   ....:     with open('a.txt','r') as f:
   ....:         for l in f:
   ....:             pass
   ....:
In [17]: %timeit foo()
The slowest run took 41.91 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 186 µs per loop
In [18]: %timeit foo1()
The slowest run took 206.14 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 179 µs per loop
In [19]: %timeit foo()
The slowest run took 202.51 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 180 µs per loop
In [20]: %timeit foo1()
10000 loops, best of 3: 193 µs per loop
In [21]: %timeit foo1()
10000 loops, best of 3: 194 µs per loop
When should I ever use file.read() or file.readlines()?
The short answer to your question is that each of these three methods of reading bits of a file has a different use case. As noted above, f.read() reads the file as a single string, and so allows relatively easy file-wide manipulations, such as a file-wide regex search or substitution.
f.readline() reads a single line of the file, allowing the user to parse a single line without necessarily reading the entire file. Using f.readline() also makes it easier to apply logic while reading than a complete line-by-line iteration does, such as when a file changes format partway through.
Using the syntax for line in f: allows the user to iterate over the file line by line, as noted in the question.
(As noted in the other answer, this documentation is a very good read):
https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects
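The three use cases described above can be sketched side by side. The sample file and its contents are assumptions made up for the illustration:

```python
import os
import re
import tempfile

# Hypothetical sample file: one header line followed by data lines.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("# header\nalpha 1\nbeta 2\n")

# read(): the whole file as one string, handy for a file-wide regex.
with open(path) as f:
    text = f.read()
numbers = re.findall(r"\d+", text)

# readline(): consume one line, then treat the rest differently.
with open(path) as f:
    header = f.readline().rstrip("\n")
    rest = f.read()

# Iteration: process the file line by line.
with open(path) as f:
    stripped = [line.rstrip("\n") for line in f]
```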
Note:
It was previously claimed that f.readline() could be used to skip a line during a for loop iteration. However, this doesn't work in Python 2.7, and is perhaps a questionable practice, so this claim has been removed.
Python, difference between 'open' and 'with open'
This error was caused by a previous version of the posted script. It looked like this:
if not(os.path.exists('teams')):
    os.makedirs('mydir')
This tests for the existence of the directory teams but tries to create a new directory mydir.
Suggested solution: use variables for paths instead of hardwiring strings:
import os

path = 'mydir'
if not(os.path.exists(path)):
    os.makedirs(path)
And yes, both #1 and #2 do essentially the same thing. But the with statement also closes the file if an exception occurs during writing.
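A quick way to see the closing behaviour is to raise an error inside the with block and inspect the file object afterwards. This is only a sketch, with a made-up scratch path and a simulated failure:

```python
import os
import tempfile

# Hypothetical output path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "out.txt")

try:
    with open(path, "w") as f:
        f.write("partial")
        raise RuntimeError("simulated failure during writing")
except RuntimeError:
    pass

# The with statement closed (and flushed) the file despite the exception.
closed_after_error = f.closed
with open(path) as check:
    written = check.read()
```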
Why is open() preferable over file() in Python?
The Zen of Python:
There should be one-- and preferably only one --obvious way to do it.
So either file or open should go.
>>> type(file)
<type 'type'>
>>> type(open)
<type 'builtin_function_or_method'>
open is a function that can return anything. file() returns only file objects.
Though it seems open returns only file objects on Python 2. And before Python 2.5, file and open were the same object.
As @gnibbler suggested in the comments, the original reason for the existence of file might be to use it as the name for base classes.
Also, file() could in principle return other types, as for example int() did on earlier Python versions:
>>> type(int(2**64)) is long
True
>>> type(int()) is int
True
>>> int is long
False
This answer is very similar to @Ryan's answer.
In addition, the BDFL said:
"The file class is new in Python 2.2. It represents the type (class)
of objects returned by the built-in open() function. Its constructor
is an alias for open(), but for future and backwards compatibility,
open() remains preferred." (emphasis mine)
What's the difference between 'r+' and 'a+' when open file in python?
Python opens files almost in the same way as in C:
r+
Open for reading and writing. The stream is positioned at the beginning of the file.
a+
Open for reading and appending (writing at end of file). The file is created if it does not exist. The initial file position for reading is at the beginning of the file, but output is appended to the end of the file (on some Unix systems, regardless of the current seek position).
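The difference can be sketched in Python 3 as follows. The file names are made up for the illustration, and the a+ part seeks explicitly before reading instead of relying on the initial position, since Python 3 may place it at the end of the file:

```python
import os
import tempfile

# Hypothetical scratch directory and files for this illustration.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "modes.txt")
with open(path, "w") as f:
    f.write("abcdef")

# r+: positioned at the beginning, so writing overwrites existing data.
with open(path, "r+") as f:
    f.write("XY")
with open(path) as f:
    after_r_plus = f.read()

# a+: output is appended; seek to the start before reading back.
with open(path, "a+") as f:
    f.write("Z")
    f.seek(0)
    after_a_plus = f.read()

# r+ needs an existing file, while a+ creates a missing one.
missing = os.path.join(folder, "missing.txt")
try:
    open(missing, "r+").close()
    r_plus_opened = True
except FileNotFoundError:
    r_plus_opened = False
with open(missing, "a+"):
    pass
a_plus_created = os.path.exists(missing)
```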
Difference between io.open vs open in python
Situation in Python3 according to the docs:
io.open(file, *[options]*)
This is an alias for the builtin open() function.
and
While the builtin open() and the associated io module are the
recommended approach for working with encoded text files, this module
[i.e. codecs] provides additional utility functions and classes that
allow the use of a wider range of codecs when working with binary
files
(bold and italics are my edits)
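On Python 3 the alias can be checked directly. The short sketch below also round-trips some encoded text through both names, using a made-up scratch path:

```python
import io
import os
import tempfile

# On Python 3, io.open and the builtin open are the same object.
same_object = io.open is open

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "text.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9")
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```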