What Is _Main_.Py

What is __main__.py?

Often, a Python program is run by naming a .py file on the command line:

$ python my_program.py

You can also create a directory or zipfile full of code, and include a __main__.py. Then you can simply name the directory or zipfile on the command line, and it executes the __main__.py automatically:

$ python my_program_dir
$ python my_program.zip
# Or, if the program is accessible as a module
$ python -m my_program

You'll have to decide for yourself whether your application could benefit from being executed like this.


Note that a __main__ module usually doesn't come from a __main__.py file. It can, but it usually doesn't. When you run a script like python my_program.py, the script will run as the __main__ module instead of the my_program module. This also happens for modules run as python -m my_module, or in several other ways.

If you saw the name __main__ in an error message, that doesn't necessarily mean you should be looking for a __main__.py file.

What does if __name__ == __main__: do in Python?

Short Answer

It's boilerplate code that protects users from accidentally invoking the script when they didn't intend to. Here are some common problems when the guard is omitted from a script:

  • If you import the guardless script in another script (e.g. import my_script_without_a_name_eq_main_guard), then the latter script will trigger the former to run at import time and using the second script's command line arguments. This is almost always a mistake.

  • If you have a custom class in the guardless script and save it to a pickle file, then unpickling it in another script will trigger an import of the guardless script, with the same problems outlined in the previous bullet.

Long Answer

To better understand why and how this matters, we need to take a step back to understand how Python initializes scripts and how this interacts with its module import mechanism.

Whenever the Python interpreter reads a source file, it does two things:

  • it sets a few special variables like __name__, and then

  • it executes all of the code found in the file.

Let's see how this works and how it relates to your question about the __name__ checks we always see in Python scripts.

Code Sample

Let's use a slightly different code sample to explore how imports and scripts work. Suppose the following is in a file called foo.py.

# Suppose this is foo.py.

print("before import")
import math

print("before function_a")
def function_a():
print("Function A")

print("before function_b")
def function_b():
print("Function B {}".format(math.sqrt(100)))

print("before __name__ guard")
if __name__ == '__main__':
function_a()
function_b()
print("after __name__ guard")

Special Variables

When the Python interpreter reads a source file, it first defines a few special variables. In this case, we care about the __name__ variable.

When Your Module Is the Main Program

If you are running your module (the source file) as the main program, e.g.

python foo.py

the interpreter will assign the hard-coded string "__main__" to the __name__ variable, i.e.

# It's as if the interpreter inserts this at the top
# of your module when run as the main program.
__name__ = "__main__"

When Your Module Is Imported By Another

On the other hand, suppose some other module is the main program and it imports your module. This means there's a statement like this in the main program, or in some other module the main program imports:

# Suppose this is in some other main program.
import foo

The interpreter will search for your foo.py file (along with searching for a few other variants), and prior to executing that module, it will assign the name "foo" from the import statement to the __name__ variable, i.e.

# It's as if the interpreter inserts this at the top
# of your module when it's imported from another module.
__name__ = "foo"

Executing the Module's Code

After the special variables are set up, the interpreter executes all the code in the module, one statement at a time. You may want to open another window on the side with the code sample so you can follow along with this explanation.

Always

  1. It prints the string "before import" (without quotes).

  2. It loads the math module and assigns it to a variable called math. This is equivalent to replacing import math with the following (note that __import__ is a low-level function in Python that takes a string and triggers the actual import):

# Find and load a module given its string name, "math",
# then assign it to a local variable called math.
math = __import__("math")

  1. It prints the string "before function_a".

  2. It executes the def block, creating a function object, then assigning that function object to a variable called function_a.

  3. It prints the string "before function_b".

  4. It executes the second def block, creating another function object, then assigning it to a variable called function_b.

  5. It prints the string "before __name__ guard".

Only When Your Module Is the Main Program


  1. If your module is the main program, then it will see that __name__ was indeed set to "__main__" and it calls the two functions, printing the strings "Function A" and "Function B 10.0".

Only When Your Module Is Imported by Another


  1. (instead) If your module is not the main program but was imported by another one, then __name__ will be "foo", not "__main__", and it'll skip the body of the if statement.

Always


  1. It will print the string "after __name__ guard" in both situations.

Summary

In summary, here's what'd be printed in the two cases:

# What gets printed if foo is the main program
before import
before function_a
before function_b
before __name__ guard
Function A
Function B 10.0
after __name__ guard
# What gets printed if foo is imported as a regular module
before import
before function_a
before function_b
before __name__ guard
after __name__ guard

Why Does It Work This Way?

You might naturally wonder why anybody would want this. Well, sometimes you want to write a .py file that can be both used by other programs and/or modules as a module, and can also be run as the main program itself. Examples:

  • Your module is a library, but you want to have a script mode where it runs some unit tests or a demo.

  • Your module is only used as a main program, but it has some unit tests, and the testing framework works by importing .py files like your script and running special test functions. You don't want it to try running the script just because it's importing the module.

  • Your module is mostly used as a main program, but it also provides a programmer-friendly API for advanced users.

Beyond those examples, it's elegant that running a script in Python is just setting up a few magic variables and importing the script. "Running" the script is a side effect of importing the script's module.

Food for Thought

  • Question: Can I have multiple __name__ checking blocks? Answer: it's strange to do so, but the language won't stop you.

  • Suppose the following is in foo2.py. What happens if you say python foo2.py on the command-line? Why?

# Suppose this is foo2.py.
import os, sys; sys.path.insert(0, os.path.dirname(__file__)) # needed for some interpreters

def function_a():
print("a1")
from foo2 import function_b
print("a2")
function_b()
print("a3")

def function_b():
print("b")

print("t1")
if __name__ == "__main__":
print("m1")
function_a()
print("m2")
print("t2")

  • Now, figure out what will happen if you remove the __name__ check in foo3.py:
# Suppose this is foo3.py.
import os, sys; sys.path.insert(0, os.path.dirname(__file__)) # needed for some interpreters

def function_a():
print("a1")
from foo3 import function_b
print("a2")
function_b()
print("a3")

def function_b():
print("b")

print("t1")
print("m1")
function_a()
print("m2")
print("t2")
  • What will this do when used as a script? When imported as a module?
# Suppose this is in foo4.py
__name__ = "__main__"

def bar():
print("bar")

print("before __name__ guard")
if __name__ == "__main__":
bar()
print("after __name__ guard")

What is the difference between __init__.py and __main__.py?

__init__.py is run when you import a package into a running python program. For instance, import idlelib within a program, runs idlelib/__init__.py, which does not do anything as its only purpose is to mark the idlelib directory as a package. On the otherhand, tkinter/__init__.py contains most of the tkinter code and defines all the widget classes.

__main__.py is run as '__main__' when you run a package as the main program. For instance, python -m idlelib at a command line runs idlelib/__main__.py, which starts Idle. Similarly, python -m tkinter runs tkinter/__main__.py, which has this line:

from . import _test as main

In this context, . is tkinter, so importing . imports tkinter, which runs tkinter/__init__.py. _test is a function defined within that file. So calling main() (next line) has the same effect as running python -m tkinter.__init__ at the command line.

How do I import an object from main.py (in the root directory), into a module contained in a subdirectory?

If you want to do a relative import without hacking around with sys.path or custom loaders/finders, you need:

  • the top-most directory you import relative to to contain an __init__.py
  • to specify said directory as a package in the path to your __main__ module (i.e. python -m root.stuff.main)

In short, this means you could add a directory project to contain the test_db.py and app, add an __init__.py, and call your program with python -m project.app.stuff.main.

If this doesn't work for you (you mention you don't want to change the project structure) you do have other options.

You could install each as its own package (app, and tests). Create a setup.py/setup.cfg/pyproject.toml, put test_db.py in a package tests, and pip install them as editable packages pip install -e .. This will allow you to import either without relative imports (just import app, and import tests.test_db, regardless of file). This is personally the route I would go, and recommend doing.

Otherwise, if you're just looking for a quick fix, there exists one more easy + hacky solution. You could add the path for test_db.py to sys.path (every path in this list is explored for the target module when importing), so you could import test_db from anywhere. Again, this is very hacky, and I don't recommend it beyond quick sanity checks or extremely urgent patches.

What is __init__.py for?

It used to be a required part of a package (old, pre-3.3 "regular package", not newer 3.3+ "namespace package").

Here's the documentation.

Python defines two types of packages, regular packages and namespace packages. Regular packages are traditional packages as they existed in Python 3.2 and earlier. A regular package is typically implemented as a directory containing an __init__.py file. When a regular package is imported, this __init__.py file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. The __init__.py file can contain the same Python code that any other module can contain, and Python will add some additional attributes to the module when it is imported.

But just click the link, it contains an example, more information, and an explanation of namespace packages, the kind of packages without __init__.py.

Being forced to import files relative to main.py in python

Relative imports only work within a single package. In your example, you have a package app with subpackages foo and bar. As long as a script or module imports app (e.g., import app), then modules in foo and bar can do relative imports.

Scripts are not in packages. This is the same model used generally with executables where the operating system is tasked with finding static dynamic libraries in a common LIB directory. So main.py has no inherent knowledge of the app package.

When python runs a script, it adds its path the sys.path so any module or directory in that path can be imported. In your case, main.py can import foo and import bar. But if you do that, foo and bar are separate packages and relative imports won't work between them.

You can solve the problem by moving main.py up one directory so that it can import app. More generally, you could move main.py into a script directory in your source tree and wrap your package + scripts into an installable distribution.

Another popular option when making a package installable, is to add entry point declarations to setup.py. The installer will add small dummy files in the operation system path that essentaily do python3 -m app.main to run the module within the package. Done this way, main.py is a part of the package and can do relative imports.



Related Topics



Leave a reply



Submit