How Does Python Importing Exactly Work

How does Python importing exactly work?

Part 1

The module is only loaded once, so there is no performance loss by importing it again. If you actually wanted it to be loaded/parsed again, you'd have to reload() the module.

The first place checked is sys.modules, the cache of all modules that have been imported previously. [source]


Part 2

from foo import * imports a to the local scope. When assigning a value to a, it is replaced with the new value - but the original foo.a variable is not touched.

So unless you import foo and modify foo.a, both calls will return the same value.

For a mutable type such as a list or dict it would be different, modifying it would indeed affect the original variable - but assigning a new value to it would still not modify foo.whatever.

If you want some more detailed information, have a look at http://docs.python.org/reference/executionmodel.html:

The following constructs bind names: formal parameters to functions, import statements, class and function definitions (these bind the class or function name in the defining block), and targets that are identifiers if occurring in an assignment, for loop header, in the second position of an except clause header or after as in a with statement.

The two bold sections are the relevant ones for you: First the name a is bound to the value of foo.a during the import. Then, when doing a = 5, the name a is bound to 5. Since modifying a list/dict does not cause any binding, those operations would modify the original one (b and foo.b are bound to the same object on which you operate). Assigning a new object to b would be a binding operation again and thus separate b from foo.b.

It is also worth noting what exactly the import statement does:

  • import foo binds the module name to the module object in the current scope, so if you modify foo.whatever, you will work with the name in that module - any modifications/assignments will affect the variable in the module.
  • from foo import bar binds the given name(s) only (i.e. foo will remain unbound) to the element with the same name in foo - so operations on bar behave like explained earlier.
  • from foo import * behaves like the previous one, but it imports all global names which are not prefixed with an underscore. If the module defines __all__ only names inside this sequence are imported.

Part 3 (which doesn't even exist in your question :p)

The python documentation is extremely good and usually verbose - you find answer on almost every possible language-related question in there. Here are some useful links:

  • http://docs.python.org/reference/datamodel.html (classes, properties, magic methods, etc.) ()
  • http://docs.python.org/reference/executionmodel.html (how variables work in python)
  • http://docs.python.org/reference/expressions.html
  • http://docs.python.org/reference/simple_stmts.html (statements such as import, yield)
  • http://docs.python.org/reference/compound_stmts.html (block statements such as for, try, with)

What exactly does import * import?

The "advantage" of from xyz import * as opposed to other forms of import is that it imports everything (well, almost... [see (a) below] everything) from the designated module under the current module. This allows using the various objects (variables, classes, methods...) from the imported module without prefixing them with the module's name. For example

>>> from math import *
>>>pi
3.141592653589793
>>>sin(pi/2)
>>>1.0

This practice (of importing * into the current namespace) is however discouraged because it

  • provides the opportunity for namespace collisions (say if you had a variable name pi prior to the import)
  • may be inefficient, if the number of objects imported is big
  • doesn't explicitly document the origin of the variable/method/class (it is nice to have this "self documentation" of the program for future visit into the code)

Typically we therefore limit this import * practice to ad-hoc tests and the like. As pointed out by @Denilson-Sá-Maia, some libraries such as (e.g. pygame) have a sub-module where all the most commonly used constants and functions are defined and such sub-modules are effectively designed to be imported with import *. Other than with these special sub-modules, it is otherwise preferable to ...:

explicitly import a few objects only

>>>from math import pi
>>>pi
>>>3.141592653589793
>>> sin(pi/2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sin' is not defined

or import the module under its own namespace (or an alias thereof, in particular if this is a long name, and the program references its objects many times)

  >>>import math
>>>math.pi
>>>3.141592653589793
etc..

>>>import math as m #bad example math being so short and standard...
>>>m.pi
>>>3.141592653589793
etc..

See the Python documentation on this topic

(a) Specifically, what gets imported with from xyz import * ?

if xyz module defines an __all__ variable, it will import all the names defined in this sequence, otherwise it will import all names, except these which start with an underscore.

Note Many libraries have sub-modules. For example the standard library urllib includes sub-modules like urllib.request, urllib.errors, urllib.response etc. A common point of confusion is that

from urllib import *

would import all these sub-modules. That is NOT the case: one needs to explicitly imports these separately with, say, from urllib.request import * etc. This incidentally is not specific to import *, plain import will not import sub-modules either (but of course, the * which is often a shorthand for "everything" may mislead people in thinking that all sub-modules and everything else would be imported).

Why is Python running my module when I import it, and how do I stop it?

Because this is just how Python works - keywords such as class and def are not declarations. Instead, they are real live statements which are executed. If they were not executed your module would be empty.

The idiomatic approach is:

# stuff to run always here such as class/def
def main():
pass

if __name__ == "__main__":
# stuff only to run when not called via 'import' here
main()

It does require source control over the module being imported, however.

Does from-import exec the whole module?

Looks like like the answer is yes:

$ echo 'print "test"
def f1():
print "f1"

def f2():
print "f2"

' > util.py

$ echo 'from util import f1
f1()
from util import f2
f2()
' > test.py

$ python test.py
test
f1
f2

$

How does circular import work exactly in Python

What you're missing is the fact that from X import Y, does not solely imports Y. It imports the module X first. It's mentioned in the page:

from X import a, b, c imports the module X, and creates references in the current namespace to the given objects. Or in other words, you can now use a and b and c in your program.

So, this statement:

from test import h

Does not stop importing when it reaches the definition of h.

Let's change the file:

test.py

h = 3
if __name__ != '__main__': #check if it's imported
print('I'm still called!')
...

When you run test.py, you'll get I'm still called! before the error.

The condition checks whether the script is imported or not. In your edited code, if you add the condition, you'll print it only when it acts as the main script, not the imported script.

Here is something to help:

  1. test imports test2 (h is defined)
  2. test2 imports test, then it meets the condition.
  3. The condition is false - test is imported -, so, test2 is not going to look for test2.j - it doesn't exist just yet.

Hope this helps!

Use 'import module' or 'from module import'?

The difference between import module and from module import foo is mainly subjective. Pick the one you like best and be consistent in your use of it. Here are some points to help you decide.

import module

  • Pros:
    • Less maintenance of your import statements. Don't need to add any additional imports to start using another item from the module
  • Cons:
    • Typing module.foo in your code can be tedious and redundant (tedium can be minimized by using import module as mo then typing mo.foo)

from module import foo

  • Pros:
    • Less typing to use foo
    • More control over which items of a module can be accessed
  • Cons:
    • To use a new item from the module you have to update your import statement
    • You lose context about foo. For example, it's less clear what ceil() does compared to math.ceil()

Either method is acceptable, but don't use from module import *.

For any reasonable large set of code, if you import * you will likely be cementing it into the module, unable to be removed. This is because it is difficult to determine what items used in the code are coming from 'module', making it easy to get to the point where you think you don't use the import any more but it's extremely difficult to be sure.

`from ... import` vs `import .`

It depends on how you want to access the import when you refer to it.

from urllib import request
# access request directly.
mine = request()

import urllib.request
# used as urllib.request
mine = urllib.request()

You can also alias things yourself when you import for simplicity or to avoid masking built ins:

from os import open as open_
# lets you use os.open without destroying the
# built in open() which returns file handles.

How do I import other Python files?

importlib was added to Python 3 to programmatically import a module.

import importlib

moduleName = input('Enter module name:')
importlib.import_module(moduleName)

The .py extension should be removed from moduleName. The function also defines a package argument for relative imports.

In python 2.x:

  • Just import file without the .py extension
  • A folder can be marked as a package, by adding an empty __init__.py file
  • You can use the __import__ function, which takes the module name (without extension) as a string extension
pmName = input('Enter module name:')
pm = __import__(pmName)
print(dir(pm))

Type help(__import__) for more details.

Relative imports for the billionth time

Script vs. Module

Here's an explanation. The short version is that there is a big difference between directly running a Python file, and importing that file from somewhere else. Just knowing what directory a file is in does not determine what package Python thinks it is in. That depends, additionally, on how you load the file into Python (by running or by importing).

There are two ways to load a Python file: as the top-level script, or as a
module. A file is loaded as the top-level script if you execute it directly, for instance by typing python myfile.py on the command line. It is loaded as a module when an import statement is encountered inside some other file. There can only be one top-level script at a time; the top-level script is the Python file you ran to start things off.

Naming

When a file is loaded, it is given a name (which is stored in its __name__ attribute).

  • If it was loaded as the top-level script, its name is __main__.
  • If it was loaded as a module, its name is [ the filename, preceded by the names of any packages/subpackages of which it is a part, separated by dots ], for example, package.subpackage1.moduleX.

But be aware, if you load moduleX as a module from shell command line using something like python -m package.subpackage1.moduleX, the __name__ will still be __main__.

So for instance in your example:

package/
__init__.py
subpackage1/
__init__.py
moduleX.py
moduleA.py

if you imported moduleX (note: imported, not directly executed), its name would be package.subpackage1.moduleX. If you imported moduleA, its name would be package.moduleA. However, if you directly run moduleX from the command line, its name will instead be __main__, and if you directly run moduleA from the command line, its name will be __main__. When a module is run as the top-level script, it loses its normal name and its name is instead __main__.

Accessing a module NOT through its containing package

There is an additional wrinkle: the module's name depends on whether it was imported "directly" from the directory it is in or imported via a package. This only makes a difference if you run Python in a directory, and try to import a file in that same directory (or a subdirectory of it). For instance, if you start the Python interpreter in the directory package/subpackage1 and then do import moduleX, the name of moduleX will just be moduleX, and not package.subpackage1.moduleX. This is because Python adds the current directory to its search path when the interpreter is entered interactively; if it finds the to-be-imported module in the current directory, it will not know that that directory is part of a package, and the package information will not become part of the module's name.

A special case is if you run the interpreter interactively (e.g., just type python and start entering Python code on the fly). In this case, the name of that interactive session is __main__.

Now here is the crucial thing for your error message: if a module's name has no dots, it is not considered to be part of a package. It doesn't matter where the file actually is on disk. All that matters is what its name is, and its name depends on how you loaded it.

Now look at the quote you included in your question:

Relative imports use a module's name attribute to determine that module's position in the package hierarchy. If the module's name does not contain any package information (e.g. it is set to 'main') then relative imports are resolved as if the module were a top-level module, regardless of where the module is actually located on the file system.

Relative imports...

Relative imports use the module's name to determine where it is in a package. When you use a relative import like from .. import foo, the dots indicate to step up some number of levels in the package hierarchy. For instance, if your current module's name is package.subpackage1.moduleX, then ..moduleA would mean package.moduleA. For a from .. import to work, the module's name must have at least as many dots as there are in the import statement.

... are only relative in a package

However, if your module's name is __main__, it is not considered to be in a package. Its name has no dots, and therefore you cannot use from .. import statements inside it. If you try to do so, you will get the "relative-import in non-package" error.

Scripts can't import relative

What you probably did is you tried to run moduleX or the like from the command line. When you did this, its name was set to __main__, which means that relative imports within it will fail, because its name does not reveal that it is in a package. Note that this will also happen if you run Python from the same directory where a module is, and then try to import that module, because, as described above, Python will find the module in the current directory "too early" without realizing it is part of a package.

Also remember that when you run the interactive interpreter, the "name" of that interactive session is always __main__. Thus you cannot do relative imports directly from an interactive session. Relative imports are only for use within module files.

Two solutions:

  1. If you really do want to run moduleX directly, but you still want it to be considered part of a package, you can do python -m package.subpackage1.moduleX. The -m tells Python to load it as a module, not as the top-level script.

  2. Or perhaps you don't actually want to run moduleX, you just want to run some other script, say myfile.py, that uses functions inside moduleX. If that is the case, put myfile.py somewhere elsenot inside the package directory – and run it. If inside myfile.py you do things like from package.moduleA import spam, it will work fine.

Notes

  • For either of these solutions, the package directory (package in your example) must be accessible from the Python module search path (sys.path). If it is not, you will not be able to use anything in the package reliably at all.

  • Since Python 2.6, the module's "name" for package-resolution purposes is determined not just by its __name__ attributes but also by the __package__ attribute. That's why I'm avoiding using the explicit symbol __name__ to refer to the module's "name". Since Python 2.6 a module's "name" is effectively __package__ + '.' + __name__, or just __name__ if __package__ is None.)



Related Topics



Leave a reply



Submit