PYTHONPATH vs. sys.path
If the only reason to modify the path is to let developers work from their working tree, you should use an installation tool to set up the environment for you. virtualenv is very popular, and if you are using setuptools, you can simply run setup.py develop to semi-install the working tree in your current Python installation.
sys.path vs. $PATH
You can read environment variables through the os.environ dictionary:
import os
my_path = os.environ['PATH']
As for finding where a package is installed: that depends on whether its directory is on Python's module search path (sys.path), which is separate from PATH.
Difference between $PATH, sys.path and os.environ
This is actually more complicated than it would seem. It's unclear from the question whether you understand the Linux/macOS $PATH environment variable, so let's start there. The $PATH variable (in Python you can access the system environment variables through os.environ) is the current user's $PATH as defined in various shell profile and environment files. It typically contains things like "/usr/bin" and other places where programs are installed. For example, when you type "ls" into the system shell, the underlying system searches the $PATH for programs named "ls", so what actually gets executed is probably something like "/usr/bin/ls". I've included additional reading below.
sys.path, on the other hand, is constructed by Python when the interpreter is started, based on a number of things. The first sentence in the help page is as follows: "A list of strings that specifies the search path for modules. Initialized from the environment variable $PYTHONPATH, plus an installation-dependent default." The installation-dependent portion typically defines the installation location of Python site packages. $PYTHONPATH is another environment variable (like $PATH) that can be set to add module search locations, and it is set the same way as the system $PATH.
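A minimal sketch of that relationship, assuming a standard CPython setup: it starts a child interpreter with $PYTHONPATH pointing at a temporary directory and checks that the directory ends up in the child's sys.path. The helper name child_sys_path is made up for this illustration.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(pythonpath):
    """Return sys.path as seen by a fresh interpreter started with PYTHONPATH set."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as extra_dir:
        # The PYTHONPATH entry is picked up when the child interpreter starts
        print(extra_dir in child_sys_path(extra_dir))
```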
If you have non-installed sources (i.e. Python files that you want to run from outside the site-packages directory), you typically need to either manipulate sys.path directly in your scripts or add the location to the $PYTHONPATH environment variable so the interpreter knows where to find your modules. Alternatively, you could use .pth files to manipulate the module search path.
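The in-script variant could look roughly like the sketch below. The function name import_from_directory is hypothetical; it only wraps the standard sys.path and importlib calls.

```python
import importlib
import sys


def import_from_directory(directory, module_name):
    """Import module_name from a directory that is not in site-packages."""
    directory = str(directory)
    if directory not in sys.path:
        # append() keeps installed packages ahead of this directory;
        # insert(0, ...) would make this directory win name clashes instead
        sys.path.append(directory)
    importlib.invalidate_caches()  # make sure newly created files are noticed
    return importlib.import_module(module_name)
```

Setting $PYTHONPATH to the same directory before starting the interpreter has a similar effect without touching the script.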
This is just a basic overview; I hope you read the docs for a better understanding.
Sources
- Linux $PATH variable information
- Python sys.path
- Python site.py
PYTHONPATH vs. sys.path (RELOADED)
The better way of doing this now is to use pip install with the -e option.
pip install -e .
It uses a directory with the setup.py file. The "." indicates this directory. This works the same way as the setuptools develop method.
I believe that develop creates an egg link in your site-packages folder which points to the folder of the library. http://pythonhosted.org/setuptools/setuptools.html#develop-deploy-the-project-source-in-development-mode
python setup.py develop
I believe this is why you get the absolute path. There may be a conflict with a develop link and an install. Things could have also been moved.
For double-clicking, just have something that checks sys.argv. If there is no value for sys.argv[1], append build, install, or develop.
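One way that check might be written, as a sketch only: the helper name with_default_command and the choice of "develop" as the default are assumptions for illustration.

```python
import sys


def with_default_command(argv, default="develop"):
    """Return a copy of argv, adding `default` when no command was given."""
    args = list(argv)
    if len(args) < 2:  # only the script name, e.g. started by double-clicking
        args.append(default)
    return args


# In a setup.py this would run before setup() is called, for example:
# sys.argv = with_default_command(sys.argv)
```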
In addition, I've always heard that you should import the module and then call the functions through the module, so that you know where the function came from. I believe the import does the same thing either way; this may clean up your imports. Python pathing and packaging can be a pain.
from package import lib
lib.foo()
PYTHONPATH sys.path difference
There are several reasons why a path may not show up in sys.path. Make sure you don't hit one of these:
- The path must exist; non-existing paths are ignored. From the PYTHONPATH documentation: "Non-existent directories are silently ignored."
- Duplicates are removed (the first entry is kept); paths are made absolute (relative to the current working directory) and compared case-insensitively on platforms where this matters. So if you have a relative path that comes down to the same absolute path in your sys.path, only the first entry is kept.
- After normalization and cleanup, the site module tries to import the sitecustomize and usercustomize modules. These could manipulate sys.path too.
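The duplicate-removal rule can be illustrated with a short sketch. This is not the site module's actual code, only a re-creation of the behaviour described above; dedupe_paths is a made-up name.

```python
import os


def dedupe_paths(paths):
    """Make paths absolute and drop later duplicates, keeping the first entry."""
    seen = set()
    result = []
    for entry in paths:
        absolute = os.path.abspath(entry)  # relative to the current directory
        key = os.path.normcase(absolute)   # case-insensitive only where it matters
        if key not in seen:
            seen.add(key)
            result.append(absolute)
    return result
```

With this rule, "." and the absolute path of the current directory collapse into a single entry, and whichever came first is the one that survives.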
You can take a closer look at your sys.path right after cleaning, and at whether there is a usercustomize module to be imported, by running the site module as a command-line tool:
python -m site
It'll print out your sys.path in a readable one-line-per-entry format.
which python vs PYTHONPATH
You're mixing 2 environment variables:
- PATH is where which looks up executables when they're accessed by name only. This variable is a list (colon- or semicolon-separated, depending on the platform) of directories containing executables. It is not Python-specific. which python just looks in this variable and prints the full path.
- PYTHONPATH is a Python-specific list of directories (colon- or semicolon-separated, like PATH) where Python looks for packages that aren't installed directly in the Python distribution. The name and format are very close to the system/shell PATH variable on purpose, but it's not used by the operating system at all, just by Python.
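To see the two lookups side by side from inside Python, something like the following sketch could be used. The two function names are invented for this example; shutil.which is the standard-library counterpart of the which command.

```python
import os
import shutil
import sys


def executable_search_dirs():
    """Directories searched for executables (what `which` looks through)."""
    return [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]


def module_search_dirs():
    """Directories Python searches for importable modules and packages."""
    return list(sys.path)


if __name__ == "__main__":
    # shutil.which returns None when no executable with that name is on PATH
    print(shutil.which("python3") or shutil.which("python"))
    print(executable_search_dirs()[:3])
    print(module_search_dirs()[:3])
```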
Why use sys.path.append(path) instead of sys.path.insert(1, path)?
If you have multiple versions of a package / module, you need to be using virtualenv (emphasis mine):
virtualenv is a tool to create isolated Python environments.
The basic problem being addressed is one of dependencies and versions, and indirectly permissions. Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications? If you install everything into /usr/lib/python2.7/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.
Or more generally, what if you want to install an application and leave it be? If an application works, any change in its libraries or the versions of those libraries can break the application.
Also, what if you can’t install packages into the global site-packages directory? For instance, on a shared host.
In all these cases, virtualenv can help you. It creates an environment that has its own installation directories, that doesn’t share libraries with other virtualenv environments (and optionally doesn’t access the globally installed libraries either).
That's why people consider insert(0, ...) to be wrong -- it's an incomplete, stopgap solution to the problem of managing multiple environments.
Is PYTHONPATH consistent across multiple import statements even if some of them manipulate the sys.path?
If your module y changes sys.path, the change is also visible in your A.py script, even if you execute importlib.reload(sys).
So imagine module 'y' executes:
from sys import path
path.clear()
In your A.py script:
import sys, importlib
import x, y
importlib.reload(sys)
print(sys.path) # is []
import z
The module z will not be found.
To fix this, you can restore your script's sys.path to the value it was given by the interpreter at startup.
From the documentation:
A list of strings that specifies the search path for modules. Initialized from the environment variable PYTHONPATH, plus an installation-dependent default.
And...
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter
Let's assume that the interpreter is not running in interactive mode or reading from stdin (it is executing a script file) and that the script is located in the current working directory.
Our A.py could look like:
import os
import site
import sys
import x, y  # 'y' clears sys.path
# We can still load sys, os, site: they are already cached in sys.modules
print(sys.path)  # []
sys.path.append(os.getcwd())  # Add the directory where the script is executed
# Add each PYTHONPATH entry, if the variable is set
sys.path.extend(p for p in os.environ.get('PYTHONPATH', '').split(os.pathsep) if p)
site.main()  # Add site packages
import z  # Now this doesn't fail
Note: Even after removing all sys.path items, importlib is able to locate the packages os, site, sys, ...
This is because importlib uses sys.modules to access such packages:
From the importlib.find_loader documentation:
If the module is in sys.modules, then sys.modules[name].loader is returned
And from the sys.modules documentation:
This is a dictionary that maps module names to modules which have already been loaded.
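The effect of that cache can be checked with a small sketch. The helper name import_with_empty_path is hypothetical; it empties sys.path only for the duration of one import and then restores it.

```python
import importlib
import sys


def import_with_empty_path(name):
    """Import `name` while sys.path is temporarily empty, then restore it."""
    saved = list(sys.path)
    sys.path.clear()
    try:
        # Succeeds for modules already present in sys.modules
        return importlib.import_module(name)
    finally:
        sys.path[:] = saved
```

Calling it with "os" returns the very same module object stored in sys.modules["os"], because the import system consults that dictionary before searching sys.path.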
EDIT:
This is a tricky solution that you can use to solve this problem: you can create a function that is invoked every time you load a module. The function checks whether sys.path was changed after the module was loaded. If so, it sets sys.path back to its original value.
from copy import copy
import builtins
import warnings
import sys
sys.path = list(sys.path)
_original_path = copy(sys.path)
_base_import = builtins.__import__
def _import(*args, **kwargs):
    try:
        module = _base_import(*args, **kwargs)
        return module
    finally:
        # Runs after every import, whether it succeeded or not
        if type(sys.path) != list or sys.path != _original_path:
            warnings.warn('System path was modified', Warning)
            # Restore path
            sys.path = copy(_original_path)
builtins.__import__ = _import
And now execute this code:
import sys
before = copy(sys.path)
import y # 'y' tries to change sys.path
after = copy(sys.path)
print(before == after) # True
It will also display a warning message on stderr.
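A less invasive alternative, which is a different technique from the import hook above, is to guard only the imports you do not trust with a context manager. This is a sketch, and preserved_sys_path is a made-up name.

```python
import contextlib
import sys


@contextlib.contextmanager
def preserved_sys_path():
    """Restore sys.path to its entry value when the block exits."""
    saved = list(sys.path)
    try:
        yield
    finally:
        if isinstance(sys.path, list):
            sys.path[:] = saved  # in place, so existing references stay valid
        else:
            sys.path = saved     # sys.path was rebound to something else


# Usage (assuming a module 'y' that tampers with sys.path):
# with preserved_sys_path():
#     import y
```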
EDIT #2 (Another solution):
This works only on Python >= 3.7 because it relies on PEP 562.
Here I basically replace the module 'sys' so that external modules cannot change the actual sys.path.
First create a script with the following code (proxy.py):
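import sys as _real_sys
from copy import copy
# Private copies: changes made through the proxy do not reach the real sys
path = copy(_real_sys.path)
modules = copy(_real_sys.modules)
def __getattr__(name):
    # Called only for names this module does not define (PEP 562).
    # Use the saved reference: once sys.modules['sys'] points at this
    # proxy, importing 'sys' here would return the proxy itself and recurse.
    return getattr(_real_sys, name)
def __dir__():
    return dir(_real_sys)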
Now, in your A.py, put the following code:
import proxy
import sys
sys.modules['sys'] = proxy
import y # y imports 'sys', but import sys returns the 'proxy' module
# 'y' thinks it changes sys.path, but it only modifies proxy.path
print(proxy.path) # []
print(sys.path) # Unchanged
Code in the y module:
import sys
sys.path.clear() # a.k.a. proxy.path.clear()
# You can still access all attributes of the sys module
print(dir(sys)) # ['ps1', 'ps2', 'platform', ...]
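The same idea can be tried out without creating a separate file. The sketch below builds a throwaway module object with a module-level __getattr__ (PEP 562); make_sys_proxy and the module name "sys_proxy" are hypothetical.

```python
import sys
import types


def make_sys_proxy():
    """Build a stand-in for sys whose `path` is a private copy."""
    proxy = types.ModuleType("sys_proxy")  # hypothetical module name
    proxy.path = list(sys.path)

    def module_getattr(name):
        # PEP 562: called only for attributes the module itself lacks
        return getattr(sys, name)

    proxy.__getattr__ = module_getattr
    return proxy
```

Clearing the proxy's path leaves the real sys.path untouched, while attributes such as platform are still forwarded to the real module.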
Effect of using sys.path.insert(0, path) and sys.path(append) when loading modules
Because Python checks the directories in sequential order, starting at the first directory in the sys.path list, until it finds the .py file it was looking for.
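A quick way to see which directory wins is to ask the path-based finder directly instead of modifying sys.path. In this sketch, first_match is a made-up helper around importlib.machinery.PathFinder, and the directory list stands in for sys.path.

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def first_match(module_name, search_path):
    """Return the file that would be imported for module_name, or None."""
    importlib.invalidate_caches()
    spec = PathFinder.find_spec(module_name, list(search_path))
    return spec.origin if spec else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for directory in (a, b):
            Path(directory, "samename.py").write_text("")
        # The earlier directory in the list is the one that gets used
        print(first_match("samename", [a, b]))
        print(first_match("samename", [b, a]))
```

This is why insert(0, path) makes a directory take priority over everything else, while append(path) only makes it a fallback.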
Ideally, the current directory or the directory of the script is always the first element in the list, unless you modify it, like you did. From the documentation -
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.
So, most probably, you had a .py file with the same name as the module you were trying to import from, in the current directory (where the script was being run from).
Also, a thing to note about ImportErrors: let's say the import error says ImportError: No module named main. It doesn't mean that main.py is overwritten; if it were overwritten, we would not be having issues trying to read it. It's some module above this that got overwritten with a .py or some other file.
Example -
My directory structure looks like -
- test
  - shared
    - __init__.py
    - phtest.py
  - testmain.py
Now, from testmain.py, I call from shared import phtest, and it works fine.
Now let's say I introduce a shared.py in the test directory, for example -
- test
  - shared
    - __init__.py
    - phtest.py
  - testmain.py
  - shared.py
Now when I try to do from shared import phtest from testmain.py, I will get the error -
ImportError: cannot import name 'phtest'
As you can see above, the file that is causing the issue is shared.py, not phtest.py.