ImportError after cython embed

Usually, a Python interpreter isn't "standalone": in order to work, it needs its standard library (for example ctypes (compiled) or site.py (interpreted)), and the path to other site-packages (for example numpy) must also be set.

Although it is possible to make a Python interpreter fully standalone by freezing the py-modules and merging all C extensions into the resulting executable (see for example this SO-post), it is easier to provide the needed installation to the embedded interpreter. One can download the files needed for a "standard" installation from the Python homepage (at least for Windows; see also this SO-question).

Sometimes finding standard modules/site-packages doesn't work out of the box: one has to help the interpreter by setting the Python path, i.e. by adding <..>/sometest/lib/python3.5/site-packages (sometest being the root folder of a virtual environment) to sys.path, either programmatically in the pyx-file or by setting the PYTHONPATH environment variable prior to start.
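
The programmatic variant could look roughly like the sketch below. It assumes the POSIX venv layout (lib/pythonX.Y/site-packages); the helper name add_venv_site_packages is made up for illustration, while site.addsitedir is the standard-library call that appends a directory to sys.path and processes its *.pth files.

```python
import os
import site
import sys


def add_venv_site_packages(venv_root, version=None):
    """Add <venv_root>/lib/pythonX.Y/site-packages to sys.path if it exists.

    Hypothetical helper: assumes the POSIX venv layout. Returns the
    directory that was added, or None if it does not exist.
    """
    major, minor = version or sys.version_info[:2]
    candidate = os.path.join(
        venv_root, "lib", f"python{major}.{minor}", "site-packages"
    )
    if not os.path.isdir(candidate):
        return None
    # addsitedir appends the directory to sys.path (if missing) and also
    # handles any *.pth files found inside it.
    site.addsitedir(candidate)
    return candidate
```

In a pyx-file this would have to run before the first import of a package that lives in the virtual environment.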

Read on for more gory details and alternative solutions.


This answer is for Linux and Python 3 (Python 3.7); the basic idea is the same for Windows/macOS, but some details might be different.

Because venv is used, we have the following alternatives to solve the issue:

  • adding <..>/sometest/lib/python3.5/site-packages (sometest being the root folder of a virtual environment) to sys.path, either programmatically in the pyx-file or by setting the PYTHONPATH environment variable prior to start.
  • placing the executable with the embedded Python in a subdirectory of sometest (e.g. bin, or a newly created one).
  • using virtualenv instead of venv.

Note: For the executable with the embedded Python, it doesn't matter whether a virtual environment (or which one) is activated.


Why does the above solve the issue in your scenario?

The problem is that the (embedded) Python interpreter needs to figure out where the following things are:

  • platform-independent directories/files, e.g. os.py, argparse.py (mostly everything *.py/*.pyc). Given sys.prefix, the interpreter can figure out where to find them (i.e. in prefix/lib/pythonX.Y).
  • platform-dependent directories/files, e.g. shared libraries. Given sys.exec_prefix, the interpreter can figure out where to find them (e.g. shared libraries can be found in exec_prefix/lib/pythonX.Y/lib-dynload).

The algorithm can be found here and the search is performed, when Py_Initialize is executed. Once these directories are found, sys.path can be constructed.
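
As a rough illustration of how the two locations follow from the prefixes, here is a small sketch. It assumes the POSIX layout described above; the function name stdlib_locations and its dictionary keys are made up for this example.

```python
import os
import sys


def stdlib_locations(prefix, exec_prefix, version=None):
    """Return where the standard library is expected for the given prefixes.

    Hypothetical helper, POSIX layout only:
    prefix/lib/pythonX.Y and exec_prefix/lib/pythonX.Y/lib-dynload.
    """
    major, minor = version or sys.version_info[:2]
    libdir = f"python{major}.{minor}"
    return {
        "platform_independent": os.path.join(prefix, "lib", libdir),
        "platform_dependent": os.path.join(exec_prefix, "lib", libdir, "lib-dynload"),
    }


if __name__ == "__main__":
    # Inside a virtual environment sys.prefix points at the environment,
    # so the "real" installation is described by the base_* variants.
    locations = stdlib_locations(sys.base_prefix, sys.base_exec_prefix)
    for name, path in locations.items():
        print(f"{name}: {path} (exists: {os.path.isdir(path)})")
```

Running it with the interpreter that matches the embedded one shows which directories the embedded executable must be able to reach.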

However, when using venv, there is a pyvenv.cfg file next to the exe or in its parent directory, which ensures that the right Python home is found; a good starting point is the home key in this file.
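
To see what such a file contains, the following sketch reads the key = value lines of a pyvenv.cfg. The parser and the sample values are illustrative assumptions, not the interpreter's actual code.

```python
def parse_pyvenv_cfg(text):
    """Parse the 'key = value' lines of a pyvenv.cfg into a dict.

    Simplified sketch: keys are lower-cased, lines without '=' are skipped.
    """
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip().lower()] = value.strip()
    return settings


# Sample content with made-up values, in the shape venv writes it.
EXAMPLE_CFG = """\
home = /usr/bin
include-system-site-packages = false
version = 3.7.3
"""

if __name__ == "__main__":
    cfg = parse_pyvenv_cfg(EXAMPLE_CFG)
    print("home:", cfg["home"])
    print("system site-packages:", cfg["include-system-site-packages"])
```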

If Py_NoSiteFlag is not set, Py_Initialize will utilize site.py (it can be found by the interpreter, because sys.prefix is known), or more precisely site.main(), to add the site-packages of the virtual environment to sys.path. While doing so, site.py looks for pyvenv.cfg and parses it. However, local site-packages are added to the Python path only when:

If a file named "pyvenv.cfg" exists one directory above
sys.executable, sys.prefix and sys.exec_prefix are set to that
directory and it is also checked for site-packages (sys.base_prefix
and sys.base_exec_prefix will always be the "real" prefixes of the
Python installation).

In your case pyvenv.cfg is not in the directory above, but in the same directory as the exe; thus the local site-packages, where the libraries were installed via pip, aren't included. Global site-packages aren't included because pyvenv.cfg has the key include-system-site-packages = false. Thus no site-packages are available and the installed libraries cannot be found.

However, moving the exe one directory down would lead to the inclusion of the local site-packages in the path.
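
The rule can be modelled in a few lines. This is a deliberately simplified sketch of the behaviour described above, not the code of site.py; the function name and the example paths are made up, and the POSIX layout is assumed.

```python
from pathlib import PurePosixPath


def venv_site_packages_for(executable, venv_root, version=(3, 7)):
    """Return the local site-packages that would be added, or None.

    Simplified model: the local site-packages are only used when
    pyvenv.cfg (located in venv_root) sits exactly one directory above
    the folder containing the executable.
    """
    exe_dir = PurePosixPath(executable).parent
    root = PurePosixPath(venv_root)
    if exe_dir.parent != root:
        return None
    return str(root / "lib" / f"python{version[0]}.{version[1]}" / "site-packages")


if __name__ == "__main__":
    root = "/home/user/sometest"  # hypothetical virtual environment
    for exe in (f"{root}/app", f"{root}/bin/app", f"{root}/a/b/app"):
        print(exe, "->", venv_site_packages_for(exe, root))
```

Only the executable in the bin subdirectory gets a site-packages path; the one next to pyvenv.cfg and the one two folders deep both yield None.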


Other scenarios are possible; what counts is the location of the executable, not which environment is activated.

A: Executable is somewhere, but not inside a virtual environment

This search heuristic works more or less reliably for installed Python interpreters, but can fail for embedded interpreters or virtual environments (see this issue for much more information).

If Python was installed using the usual apt install or similar, then it will be found (due to step 4 in the search algorithm) and the system installation will be used by the embedded interpreter.

However, if files were moved around or Python was built from source but not installed, then the embedded interpreter cannot start up:

Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Fatal Python error: initfsencoding: unable to load the file system codec
ModuleNotFoundError: No module named 'encodings'

In this case, calling Py_SetPythonHome or setting the environment variable $PYTHONHOME are possible solutions.
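
The environment-variable route can be sketched from a launcher script as follows. The helper name, the installation path and the executable name are assumptions for illustration; the value format follows the <prefix>[:<exec_prefix>] hint in the error message above (os.pathsep is ':' on Linux).

```python
import os


def env_with_python_home(prefix, exec_prefix=None, base_env=None):
    """Return a copy of the environment with PYTHONHOME set.

    Hypothetical helper: the value is <prefix> or
    <prefix><pathsep><exec_prefix>, as suggested by the error message.
    """
    env = dict(os.environ if base_env is None else base_env)
    if exec_prefix is None:
        env["PYTHONHOME"] = prefix
    else:
        env["PYTHONHOME"] = f"{prefix}{os.pathsep}{exec_prefix}"
    return env


# Usage (hypothetical paths and executable name):
#   import subprocess
#   subprocess.run(["./embedded_app"], env=env_with_python_home("/opt/python3.7"))
```

The same effect is achieved in a shell by exporting PYTHONHOME before starting the executable.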

B: Executable inside a virtual environment, created with virtualenv

Assuming the virtual environment and the embedded Python use the same Python version (otherwise we have the above case), the embedded exe will use the local site-packages. The home search algorithm will always find the local home, due to this rule:

Step 3. Try to find prefix and exec_prefix relative to argv0_path, backtracking up the path until it is exhausted. This is the most common step to succeed. Note that if prefix and exec_prefix are
different, exec_prefix is more likely to be found; however if
exec_prefix is a subdirectory of prefix, both will be found.

In this case argv0_path is the path to the exe (there is no pyvenv.cfg file!), and the "landmarks" (lib/python$VERSION/os.py and lib/python$VERSION/lib-dynload) will be found, because they are present as symlinks in the local home above the exe.
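
The backtracking in step 3 can be approximated as below. This is a simplified sketch: the real algorithm looks for prefix and exec_prefix separately, while this version requires both landmarks in the same directory. The function names are made up.

```python
import os


def has_landmarks(prefix, version):
    """Check for lib/pythonX.Y/os.py and lib/pythonX.Y/lib-dynload below prefix."""
    lib = os.path.join(prefix, "lib", f"python{version[0]}.{version[1]}")
    return os.path.isfile(os.path.join(lib, "os.py")) and os.path.isdir(
        os.path.join(lib, "lib-dynload")
    )


def find_home(argv0_path, version):
    """Walk upwards from argv0_path until a directory with both landmarks is found.

    Returns that directory, or None once the filesystem root is exhausted.
    """
    current = os.path.abspath(argv0_path)
    while True:
        if has_landmarks(current, version):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

Calling find_home with the directory of the executable and the version of the embedded interpreter shows which home, if any, this step would pick.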

C: Executable two folders deep inside a venv-environment

Going two folders down, rather than one (where it works), in a venv environment results in case A: the pyvenv.cfg file isn't read while searching for home (it is too far above), and venv environments lack symlinks to the "landmarks" (locally, only site-packages are present), so step 3 will fail, with step 4 being the only hope.


Corollary: Embedded Python will not work without a proper Python installation, unless, among other possibilities:

  • the needed files are packed into lib\pythonX.Y\* next to the embedding executable or somewhere above (and there is no pyvenv.cfg around to mess the search up).

  • or pyvenv.cfg is used to point the interpreter to the right location.

Distribute embed-cython-compiled .exe and run on another machine without python

I solved it by adding .\sip.pyd. Then I checked the size of the resulting folder: it was around 20 MB. With PyInstaller, the resulting .exe is about 43 MB.

Prevent an embedded Python from looking in my default path C:\Python38 for modules

Thanks to @ead's answer and his link to getpath.c, which finally redirects to getpathp.c in the case of Windows, we can learn that the rule for building the module search path is:

  • current directory first

  • PYTHONPATH env. variable

  • registry key HKEY_LOCAL_MACHINE\SOFTWARE\Python or the same in HKCU

  • PYTHONHOME env. variable

  • finally:

    Iff - we can not locate the Python Home, have not had a PYTHONPATH
    specified, and can't locate any Registry entries (ie, we have nothing
    we can assume is a good path), a default path with relative entries is
    used (eg. .\Lib;.\DLLs, etc)


Conclusion: in order to debug an embedded version of Python, without interfering with the default system install (C:\Python38 in my case), I finally solved it by temporarily renaming the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Python to HKEY_LOCAL_MACHINE\SOFTWARE\PythonOld.

Side note: I'm not sure I will ever revert this registry key back to normal: my normal Python install shouldn't need it anyway to find its path, since when I run python.exe from anywhere (it is in the PATH for everyday use), it will automatically look in .\Lib\ and .\DLLs\, which is correct. I don't see a single use case in which the python.exe of my normal install wouldn't find its subdirectories .\Lib\ or .\DLLs\ and would require the registry for this. In which use case would the registry be necessary? If python.exe is started, then its path has been found, and it can use its .\Lib subfolder without help from the registry. I think 99.99% of the time this registry feature does more harm than good, preventing a Python install from being really "portable" (i.e. one that we can move from one folder to another).


Notes:

  • To be 100% sure, I also did this in command line, but I don't think it's necessary:

    set PATH=
    set PYTHONPATH=
    set PYTHONHOME=
  • A check that might be helpful when debugging an embedded Python: import ctypes. If you don't have _ctypes.pyd and libffi-7.dll in your embedded install folder, it should fail. If it doesn't, this means it looks somewhere else (probably in your default system-wide Python install).
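
A variant of this check that also reports where a module would be loaded from could look like the sketch below. The helper name module_origin is made up; importlib.util.find_spec is the standard-library call used to locate a module.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # If these paths point into the system-wide install rather than the
    # embedded folder, the embedded interpreter is looking in the wrong place.
    for name in ("ctypes", "_ctypes"):
        print(name, "->", module_origin(name))
```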

Compiling required external modules with cython

From my experience, it is not that straightforward to create a standalone executable from multiple Python files (yours or from dependencies like psycopg2).
I would say there are a couple of approaches I would try here:

The first one would be cython_freeze: https://github.com/cython/cython/tree/master/Demos/freeze. I do not use it myself, so I cannot say much about it.

The second one is to use pyinstaller to create such an executable. It takes the .py or .pyc files as input and embeds them into one executable, together with the Python interpreter and the required dependencies, so you don't have to install anything on the target machine. Note, however, that your code will run as interpreted Python and can be easily decompiled and inspected.

If you really need to compile (cythonize) your code, then you can first cythonize() and then build your extensions with setup(), and afterwards run pyinstaller as above (taking care that it doesn't find the .py or .pyc files, just the .pyd or .so extensions) to generate the standalone executable. In both cases, pyinstaller will collect all your dependencies and embed them in the executable (and if it fails to find some, you can tell pyinstaller to embed them with its --hidden-import option).
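
For illustration, the pyinstaller call could be assembled as in the sketch below. The helper name and the script name are made up; --onefile and --hidden-import are PyInstaller command-line options.

```python
def pyinstaller_command(entry_script, hidden_imports=(), onefile=True):
    """Build the argument list for a pyinstaller run.

    Hypothetical helper: modules that pyinstaller cannot detect on its own
    (e.g. imports inside compiled extensions) are passed via --hidden-import.
    """
    cmd = ["pyinstaller"]
    if onefile:
        cmd.append("--onefile")
    for module in hidden_imports:
        cmd += ["--hidden-import", module]
    cmd.append(entry_script)
    return cmd


# Usage (requires pyinstaller to be installed; main.py is a placeholder):
#   import subprocess
#   subprocess.run(pyinstaller_command("main.py", ["psycopg2"]), check=True)
```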

There are surely other approaches, like py2exe, but when I researched and played with several technologies some months ago, pyinstaller was the best option for me. I run the process on Windows, Linux and Mac without many changes.

EDIT: I didn't realize that the example is Python 3. At the time of writing, PyInstaller only worked for Python 2.x; later releases support Python 3.
