Do I Need to Import Submodules Directly

Python: importing a sub‑package or sub‑module

You seem to be misunderstanding how import searches for modules. When you use an import statement it always searches by the full dotted module name (consulting sys.modules and then the module search path); it doesn't make use of module objects in the local namespace that exist because of previous imports. When you do:

import package.subpackage.module
from package.subpackage import module
from module import attribute1

The second line looks for a package called package.subpackage and imports module from that package. This line has no effect on the third line. The third line just looks for a module called module and doesn't find one. It doesn't "re-use" the object called module that you got from the line above.

In other words from someModule import ... doesn't mean "from the module called someModule that I imported earlier..." it means "from the module named someModule that you find on sys.path...". There is no way to "incrementally" build up a module's path by importing the packages that lead to it. You always have to refer to the entire module name when importing.
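To see this concretely, here is a small sketch using the standard library's xml package (standing in for package.subpackage.module): the bare name fails even though the dotted import of the very same module just succeeded.

```python
from xml.etree import ElementTree   # full dotted path: works

try:
    # A bare name is looked up on the module search path, not in the
    # local namespace, so the module we just imported is not found
    # under this name.
    import ElementTree
except ImportError:
    print("ImportError")  # → ImportError
```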

It's not clear what you're trying to achieve. If you only want to import the particular object attribute1, just do from package.subpackage.module import attribute1 and be done with it. You need never worry about the long package.subpackage.module once you've imported the name you want from it.

If you do want to have access to the module to access other names later, then you can do from package.subpackage import module and, as you've seen you can then do module.attribute1 and so on as much as you like.

If you want both --- that is, if you want attribute1 directly accessible and you want module accessible, just do both of the above:

from package.subpackage import module
from package.subpackage.module import attribute1
attribute1 # works
module.someOtherAttribute # also works

If you don't like typing package.subpackage even twice, you can just manually create a local reference to attribute1:

from package.subpackage import module
attribute1 = module.attribute1
attribute1 # works
module.someOtherAttribute # also works
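The same aliasing trick, shown with a real standard-library package for illustration:

```python
from xml.etree import ElementTree

# Bind the attribute to a local name once...
fromstring = ElementTree.fromstring

# ...then use both the short name and the module freely.
elem = fromstring("<a><b/></a>")
print(elem.tag)  # a
print(len(elem)) # 1
```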

Do I need to import submodules directly?

If I want to use a method in foo.bar, do I need to import foo.bar directly or is importing foo sufficient?

You'll need to import the submodule explicitly. Executing import foo.bar will automatically import the parent module foo, and necessarily bind the name foo, but the reverse is not true.
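A quick demonstration with the standard library's xml package (whose __init__.py does not import its submodules), run in fresh interpreters so earlier imports in the current session can't interfere:

```python
import subprocess
import sys

# Importing the submodule also imports and binds the parent...
ok = subprocess.run(
    [sys.executable, "-c", "import xml.etree.ElementTree; xml.etree"],
    capture_output=True, text=True)
print(ok.returncode)  # 0

# ...but importing only the parent does not expose the submodule.
bad = subprocess.run(
    [sys.executable, "-c", "import xml; xml.etree"],
    capture_output=True, text=True)
print("AttributeError" in bad.stderr)  # True
```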

But I could have sworn I've seen code where it's not imported directly and still works fine

Yes. Sometimes accessing a submodule works without the explicit import. This happens when a parent module itself imports the submodules. Never rely on that unless it's documented, because it may be an implementation detail and could change without warning after a library version upgrade.

As an example of a popular library that demonstrates both behaviors, look at requests==2.18.4. This package has submodules called sessions and help (amongst others). Importing requests makes requests.sessions available implicitly, yet requests.help is not available until explicitly imported. You'll find that when the package's __init__.py is executed, the sessions submodule gets imported but the help submodule does not.

This makes sense, because subsequent use of foo.bar requires an attribute access on an existing foo object. Note that from foo.bar import something does not bind the name foo nor foo.bar, though both modules foo and foo.bar are imported and cached into sys.modules.
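The same distinction, shown with the standard library:

```python
import sys

from xml.etree.ElementTree import fromstring

# Only 'fromstring' was bound in this namespace...
print('xml' in dir())  # False

# ...yet the whole chain of modules was imported and cached.
print('xml' in sys.modules)                    # True
print('xml.etree' in sys.modules)              # True
print('xml.etree.ElementTree' in sys.modules)  # True
```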

cannot import submodule from a module

If you want to be able to say ...

from object_detection.utils import utils_image

... then clearly the utils directory must be a subdirectory of the object_detection directory and not a sibling directory, i.e. at the same level.

Now for your other error:

ImportError: attempted relative import with no known parent package

You did not really specify under what circumstances you get this error, other than saying "Running above __init__.py files gives me an error:". But how are you "running" these .py files, and what does that even mean?

If you are executing a script when this occurs (how else would you be getting this error?), the script must be invoked as a module (because scripts cannot have relative imports -- see below) as follows (we will assume that the script you are trying to execute is test_utils_image.py):

First, the parent directory of object_detection, which is Object_Detection, must be in the system path of directories to be searched for finding modules and packages referenced in import statements. In general, this can be accomplished in several ways, for instance:

  1. The script you are executing is in Object_Detection (the directory of the script is automatically added to the sys.path list of directories to be searched by the interpreter).
  2. Dynamically appending Object_Detection to the sys.path list of directories at runtime by your script.
  3. Appending Object_Detection to the PYTHONPATH environment variable.

Item 1 above would not be applicable for this specific case since the module we are executing by definition is not in the Object_Detection directory.

Note that if your classes will eventually be installed with pip, then site-packages will be the parent directory of object_detection, which is already in sys.path.
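Item 2 can be sketched as follows. Everything here (the temp directory, the file contents) is created on the fly just to mimic the question's layout for illustration:

```python
import pathlib
import sys
import tempfile

# Build a throwaway copy of the question's layout:
#   <root>/object_detection/utils/utils_image.py
root = pathlib.Path(tempfile.mkdtemp())
utils = root / "object_detection" / "utils"
utils.mkdir(parents=True)
(root / "object_detection" / "__init__.py").write_text("")
(utils / "__init__.py").write_text("")
(utils / "utils_image.py").write_text("def load():\n    return 'loaded'\n")

# Item 2: append the directory *containing* object_detection to sys.path.
sys.path.append(str(root))

from object_detection.utils import utils_image
print(utils_image.load())  # loaded
```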

Then you can execute your script as:

python -m tests.test_utils_image

If you want to execute this .py file as a script, for example by right-clicking on it in VS Code, then see Relative imports for the billionth time, in particular the section Scripts can't import relative, which says it all -- it cannot work!
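You can reproduce the underlying limitation in one line: a script (or a -c command) runs as top-level code with no parent package, so a relative import has nothing to be relative to.

```python
import subprocess
import sys

# Run a relative import as top-level code in a fresh interpreter.
result = subprocess.run(
    [sys.executable, "-c", "from . import anything"],
    capture_output=True, text=True)

print("no known parent package" in result.stderr)  # True
```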

To invoke this as a script, just convert the relative imports to absolute imports. In fact, the PEP 8 Style Guide says:

Absolute imports are recommended, as they are usually more readable and tend to be better behaved (or at least give better error messages) if the import system is incorrectly configured (such as when a directory inside a package ends up on sys.path):

Should I import both module and submodule

Usually, importing a module does import all of its parents, and doesn't import any of its submodules. If you implement everything in pure Python, and don't do anything funky, that's how it works.

But extension modules may work differently, and some packages deliberately import their own submodules. (For example, if you import os, you get os.path.)

And pure Python code can do funky things—e.g., instead of a real package layout on disk, you can write a top-level module that dynamically builds the package, in which case users will have to import that top-level module first (and may or may not get the submodules for free).
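os.path is the classic example of such funkiness: there is no os/ package directory at all. os is a plain module that picks posixpath or ntpath at import time, assigns it as os.path, and registers it in sys.modules.

```python
import os
import sys

# os.path is really posixpath (or ntpath), spliced in by os itself.
print(os.path is sys.modules["os.path"])            # True
print(os.path.__name__ in ("posixpath", "ntpath"))  # True

# Because os registered it under 'os.path', even "import os.path"
# works despite there being no os/ directory on disk.
import os.path
```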

It's always safe to explicitly import everything you're going to use directly. And there's no real harm in doing so—an extra line of code, a few nanoseconds for the importer to see that the module is already in the dict and do nothing, that's about it.

And meanwhile, it's clear to the human reader—and to an IDE that doesn't actually do the imports. When you call logging.getLogger, I can see that you did import logging, so I know that logging is a module, not some other global.

But if you really want to, you can learn how each specific package works and use that knowledge. Going in this direction is rarely confusing in practice. And, even in the opposite direction, a lot of real-world code does rely on import os giving you os.path.

In the particular case of logging, I believe all of the examples in the tutorial and cookbook in the docs explicitly import logging, so if you're asking what's most idiomatic, it's probably that.

Python submodule imports using __init__.py

You probably already understand that when you import a module, the interpreter creates a new namespace and executes the code of that module with the new namespace as both the local and global namespace. When the code completes execution, the module name (or the name given in any as clause) is bound to the module object just created within the importing namespace and recorded against its __name__ in sys.modules.

When a qualified name such as package.subpackage.module is imported the first name (package) is imported into the local namespace, then subpackage is imported into package's namespace and finally module is imported into package.subpackage's namespace. Imports using from ... import ... as ... perform the same sequence of operations, but the imported objects are bound directly to names in the importing module's namespace. The fact that the package name isn't bound in your local namespace does not mean it hasn't been imported (as inspection of sys.modules will show).

The __init__.py in a package serves much the same function as a module's .py file. A package, having structure, is written as a directory which can also contain modules (regular .py files) and subdirectories (also containing an __init__.py file) for any sub_packages. When the package is imported a new namespace is created and the package's __init__.py is executed with that namespace as the local and global namespaces. So to answer your problem we can strip your filestore down by omitting the top-level package, which will never be considered by the interpreter when test.py is run as a program. It would then look like this:

test.py
subpackage/
    __init__.py
    hello_world.py

Now, subpackage is no longer a sub-package, as we have removed the containing package as irrelevant. Focusing on why the do_something name is undefined might help. test.py does not contain any import, and so it's unclear how you are expecting do_something to acquire meaning. You could make it work by using an empty subpackage/__init__.py and then you would write test.py as

from subpackage.hello_world import do_something
do_something()

Alternatively you could use a subpackage/__init__.py that reads

from .hello_world import do_something

(the leading dot makes this an explicit relative import, which Python 3 requires inside a package). This establishes the do_something name inside the subpackage namespace when the package is imported. Then use a test.py that imports the function from the package, like this:

from subpackage import do_something
do_something()

A final alternative with the same __init__.py is to use a test.py that simply imports the (sub)package and then use relative naming to access the required function:

import subpackage
subpackage.do_something()

to gain access to it in your local namespace.

With the empty __init__.py this could also be achieved with a test.py reading

import subpackage.hello_world
subpackage.hello_world.do_something()

or even

from subpackage.hello_world import do_something
do_something()

An empty __init__.py will mean that the top-level package namespace will contain only the names of any subpackages the program imports, which allows you to import only the subpackages you require. This gives you control over the namespace of the top-level package.

While it's perfectly possible to define classes and functions in the __init__.py, a more normal approach is to import things into that namespace from submodules, so that importers can just import the top-level package to gain access to its contents with a single-level attribute reference, or even use from to import only the names they specifically want.
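Here is a runnable sketch of that re-export pattern; the package is generated in a temp directory purely for illustration:

```python
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "subpackage"
pkg.mkdir()
(pkg / "hello_world.py").write_text(
    "def do_something():\n    return 'hello world'\n")
# The __init__.py hoists the name into the package namespace.
# Note the leading dot: an explicit relative import.
(pkg / "__init__.py").write_text(
    "from .hello_world import do_something\n")

sys.path.append(str(root))
from subpackage import do_something
print(do_something())  # hello world
```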

Ultimately the best tool to keep you straight is a clear understanding of how import works and what effect its various forms have on the importing namespace.

In Python, how to know when importing a submodule vs the main module is mandatory?

The short answer is that you can trust dir(), but not help().

The long answer:

Let's take an example the multiprocessing module of Python 3.8.5 (which is what I have). The directory structure of my installation is in part:

Python38/
    Lib/
        multiprocessing/
            dummy/
                __init__.py
                connection.py
            __init__.py
            pool.py

Now I import the multiprocessing module and do a dir against it and observe that neither the dummy nor pool modules appear:

Python 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import multiprocessing
>>> dir(multiprocessing)
['Array', 'AuthenticationError', 'Barrier', 'BoundedSemaphore', 'BufferTooShort', 'Condition', 'Event', 'JoinableQueue', 'Lock', 'Manager', 'Pipe', 'Pool', 'Process', 'ProcessError', 'Queue', 'RLock', 'RawArray', 'RawValue', 'SUBDEBUG', 'SUBWARNING', 'Semaphore', 'SimpleQueue', 'TimeoutError', 'Value', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'active_children', 'allow_connection_pickling', 'context', 'cpu_count', 'current_process', 'freeze_support', 'get_all_start_methods', 'get_context', 'get_logger', 'get_start_method', 'log_to_stderr', 'parent_process', 'process', 'reducer', 'reduction', 'set_executable', 'set_forkserver_preload', 'set_start_method', 'sys']

And, sure enough, if I try to access those modules, I get an error:

>>> multiprocessing.pool
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'multiprocessing' has no attribute 'pool'
>>> multiprocessing.dummy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'multiprocessing' has no attribute 'dummy'

But if I issue help(), I get (in part):

>>> help(multiprocessing)
Help on package multiprocessing:

NAME
    multiprocessing

MODULE REFERENCE
    https://docs.python.org/3.8/library/multiprocessing

    The following documentation is automatically generated from the Python
    source files. It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations. When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    # Package analogous to 'threading.py' but using processes
    #
    # multiprocessing/__init__.py
    #
    # This package is intended to duplicate the functionality (and much of
    # the API) of threading.py but uses processes instead of threads. A
    # subpackage 'multiprocessing.dummy' has the same API but is a simple
    # wrapper for 'threading'.
    #
    # Copyright (c) 2006-2008, R Oudkerk
    # Licensed to PSF under a Contributor Agreement.
    #

PACKAGE CONTENTS
    connection
    context
    dummy (package)
    forkserver
    heap
    managers
    pool
    popen_fork
    (rest of listing omitted)

You can see that dummy and pool are included, but we know that I have to import these submodules explicitly.
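If you'd rather check programmatically than read help() output, pkgutil lists a package's submodules from the filesystem, whether or not __init__.py imported them:

```python
import multiprocessing
import pkgutil

# iter_modules scans the package's directory, much as help()'s
# PACKAGE CONTENTS section does.
names = sorted(m.name for m in pkgutil.iter_modules(multiprocessing.__path__))
print("pool" in names)   # True
print("dummy" in names)  # True

# By contrast, dir(multiprocessing) only shows what __init__.py actually
# bound (plus anything a previous import happened to attach).
```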

Now we notice from the dir listing that context is listed and it is also listed among the packages named in the help listing. So I should be able to access it without any further importing, and I can:

>>> multiprocessing.context
<module 'multiprocessing.context' from 'C:\\Program Files\\Python38\\lib\\multiprocessing\\context.py'>

And finally:

>>> from multiprocessing.pool import Pool
>>> from multiprocessing.dummy import Pool
>>>

Ultimately the documentation should tell you what you need to import if only by presenting examples.
