How to Install Writable Shared and User Specific Data Files with Setuptools

How to include package data with setuptools/distutils?

I realize that this is an old question, but for people finding their way here via Google: package_data is a low-down, dirty lie. At least with older distutils and setuptools releases, it is only used when building binary packages (python setup.py bdist ...) but not when building source packages (python setup.py sdist ...). This is, of course, ridiculous: one would expect that building a source distribution would result in a collection of files that could be sent to someone else to build the binary distribution.

In any case, using MANIFEST.in will work both for binary and for source distributions.

How to add platform-specific package data in setup.py?

This is the solution I am currently using for pypdfium2:

  • Create a class of supported platforms whose values correspond to the data directory names:
class PlatformNames:
    darwin_x64 = "darwin_x64"
    linux_x64 = "linux_x64"
    windows_x64 = "windows_x64"
    # ...
    sourcebuild = "sourcebuild"
  • Wrap setuptools.setup() with a function that takes the platform name as argument and copies platform-dependent files into the source tree as required:
# A list of non-Python file names to consider for inclusion in the installation, e.g.
Libnames = (
    "somelib.so",
    "somelib.dll",
    "somelib.dylib",
)

# _clean() removes possible old binaries/bindings
# _copy_bindings() copies the new stuff into the source tree
# _get_bdist() returns a custom `wheel.bdist_wheel` subclass with the `get_tag()` and `finalize_options()` functions overridden so as to tag the wheels according to their target platform.

def mkwheel(pl_name):
    _clean()
    _copy_bindings(pl_name)
    setuptools.setup(
        package_data = {"": Libnames},
        cmdclass = {"bdist_wheel": _get_bdist(pl_name)},
        # ...
    )
    # not cleaning up afterwards so that editable installs work (`pip3 install -e .`)
  • In setup.py, query for a custom environment variable defining the target platform (e. g. $PYP_TARGET_PLATFORM).
    • If set to a value that indicates the need for a source distribution (e. g. sdist), run the raw setuptools.setup() function without copying in any build artifacts.
    • If set to a platform name, build for the requested platform. This makes packaging platform-independent and avoids the need for native hosts to craft the wheels.
    • If not set, detect the host platform using sysconfig.get_platform() and call mkwheel() with the corresponding PlatformNames member.
      • In case the detected platform is not supported, trigger code that performs a source build, moves the created files into data/sourcebuild/ and runs mkwheel(PlatformNames.sourcebuild).
  • Write a script that iterates through the platform names, sets your environment variable and runs python3 -m build --no-isolation --skip-dependency-check --wheel for each. Also invoke build once with --sdist instead of --wheel and the environment variable set to the value for source distribution.

→ If all goes well, the platform-specific wheels and a source distribution will be written into dist/.
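The environment-variable dispatch from the third step could look roughly like the sketch below. This is not pypdfium2's actual code: the "sdist" sentinel, the function names resolve_target() and detect_host(), and the mapping from sysconfig tags to directory names are all assumptions made for illustration.

```python
import sysconfig

# Hypothetical names for this sketch; the answer only mentions $PYP_TARGET_PLATFORM.
ENV_VAR = "PYP_TARGET_PLATFORM"
SDIST = "sdist"
SOURCEBUILD = "sourcebuild"
SUPPORTED = ("darwin_x64", "linux_x64", "windows_x64")


def detect_host(tag=None):
    """Map a sysconfig platform tag to one of the data directory names."""
    tag = (tag or sysconfig.get_platform()).lower()
    if tag.startswith("macosx") and tag.endswith("x86_64"):
        return "darwin_x64"
    if tag.startswith("linux") and tag.endswith("x86_64"):
        return "linux_x64"
    if tag == "win-amd64":
        return "windows_x64"
    # Unsupported host: the caller is expected to perform a source build.
    return SOURCEBUILD


def resolve_target(env_value, host_tag=None):
    """Decide what setup.py should build from the environment variable's value."""
    if not env_value:
        # Variable unset or empty: build for the machine we are running on.
        return detect_host(host_tag)
    if env_value == SDIST:
        return SDIST
    if env_value in SUPPORTED:
        return env_value
    raise ValueError(f"unknown target platform: {env_value!r}")
```

In setup.py one would then call something like resolve_target(os.environ.get(ENV_VAR)) and either run the plain setuptools.setup() (for "sdist") or mkwheel() with the returned name.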

Perhaps this is a lot easier to understand just by looking at pypdfium2's code (especially setup.py, setup_base.py and craft_packages.py).
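For readers who just want the shape of the packaging script from the last step, here is a minimal sketch. It assumes the `build` package is installed and reuses the $PYP_TARGET_PLATFORM variable; the "sdist" value and the function names plan_builds() and run_builds() are hypothetical, not taken from craft_packages.py.

```python
import os
import subprocess
import sys

ENV_VAR = "PYP_TARGET_PLATFORM"
SDIST = "sdist"  # hypothetical value meaning "build a source distribution"
PLATFORMS = ("darwin_x64", "linux_x64", "windows_x64")
BUILD_FLAGS = ["--no-isolation", "--skip-dependency-check"]


def plan_builds(platforms=PLATFORMS, python=sys.executable):
    """Return (env_value, command) pairs: one wheel per platform, then one sdist."""
    base = [python, "-m", "build", *BUILD_FLAGS]
    jobs = [(name, base + ["--wheel"]) for name in platforms]
    jobs.append((SDIST, base + ["--sdist"]))
    return jobs


def run_builds(jobs, runner=subprocess.run):
    """Run every job with the environment variable set for that job only."""
    for value, command in jobs:
        # Copy the current environment and override just the target variable.
        env = dict(os.environ, **{ENV_VAR: value})
        runner(command, env=env, check=True)
```

Calling run_builds(plan_builds()) from the project root would then invoke `python -m build` once per platform and once for the source distribution. Keeping the planning separate from the running makes it easy to print or inspect the commands before anything is built.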

Disclaimer: I am not experienced with the setup infrastructure of Python and merely wrote this code out of personal need. I acknowledge that the approach is a bit "hacky". If there is a possibility to achieve the same goal while using the setuptools API in a more official sort of way, I'd be interested to hear about it.

Update 1: A drawback of this concept is that the content wrongly ends up in a purelib folder, although it should be platlib as per PEP 427. I'm not sure how to instruct wheel/setuptools differently. Luckily, this is mostly a cosmetic problem.

Update 2: Found a fix to the purelib problem:

class BinaryDistribution (setuptools.Distribution):
    def has_ext_modules(self):
        return True

setuptools.setup(
    # ...
    distclass = BinaryDistribution,
)

Setup.py - Add data files inside package in setuptools

Moving both config and doc dirs under mypackage (the one that is actually a package, containing an __init__.py) should fix the issue. The changed directory structure from the question:

mypackage/
├── mypackage/
│   ├── __init__.py
│   ├── config/
│   │   └── config.json
│   ├── docs/
│   │   ├── __init__.py
│   │   └── doc_folder/
│   │       └── text_file.txt
│   └── main.py
├── setup.py
└── MANIFEST.in

setuptools: adding additional files outside package

There is also data_files

data_files=[("yourdir",
["additionalstuff/moredata.txt", "INFO.txt"])],

Have a think about where you want to put those files. More info in the docs.

How include static files to setuptools - python package

As pointed out in the comments, there are 2 ways to add the static files:

1 - include_package_data=True + MANIFEST.in

A MANIFEST.in file in the same directory as setup.py that looks like this:

include src/static/*
include src/Potato/*.txt

With include_package_data = True in setup.py.

2 - package_data in setup.py

package_data = {
    'static': ['*'],
    'Potato': ['*.txt']
}

Specify the files inside the setup.py.



Do not use both include_package_data and package_data in setup.py.

include_package_data will nullify the package_data information.

Official docs:

https://setuptools.readthedocs.io/en/latest/userguide/datafiles.html

How to add package data recursively in Python setup.py?

  1. Use Setuptools instead of distutils.
  2. Use data files instead of package data. These do not require __init__.py.
  3. Generate the lists of files and directories using standard Python code, instead of writing it literally:

    import glob

    data_files = []
    directories = glob.glob('data/subfolder?/subfolder??/')
    for directory in directories:
        files = glob.glob(directory+'*')
        data_files.append((directory, files))
    # then pass data_files to setup()
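If the directory depth is not fixed, the glob patterns above get awkward. A variation on the same idea, sketched here with os.walk instead of glob (the helper name collect_data_files is made up for illustration), gathers every file under a root directory:

```python
import os


def collect_data_files(root):
    """Return a data_files list with one (directory, [files]) entry
    for every directory under root that contains at least one file."""
    data_files = []
    for directory, subdirs, names in os.walk(root):
        subdirs.sort()  # makes the traversal order deterministic
        files = [os.path.join(directory, name) for name in sorted(names)]
        if files:
            data_files.append((directory, files))
    return data_files
```

In setup.py this would be used as data_files=collect_data_files('data'), with the path given relative to the directory that holds setup.py.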

Add custom action to setup.py

It is not a good idea to do any customization at install time. It is good practice to do customization at run time, usually at the start of the first run.

At the start of your program, you should check if login and pass are somehow available. If login and pass are not available, then ask the user to enter them and save the values in a file. Usually such files should be saved in user configuration directory. Typically you would use the platformdirs library to get the right location for such a file.

Something like this:

import pathlib

import platformdirs

user_config_dir = platformdirs.user_config_dir('MyApp', 'tibhar940')
user_config_path = pathlib.Path(user_config_dir, 'config.cfg')

if user_config_path.is_file():
    pass  # read the saved values
else:
    pass  # prompt the user and save the answers in the file
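To make the two branches concrete, here is one way the read-or-prompt step might be filled in with the standard library's configparser. The function name load_or_create_config, the "auth" section and the key names are assumptions for this sketch; the path argument is where you would pass the platformdirs location computed above.

```python
import configparser
import pathlib


def load_or_create_config(path, ask=input):
    """Return (login, password), prompting and saving them on the first run."""
    path = pathlib.Path(path)
    # interpolation=None so that '%' characters in a password are stored literally
    config = configparser.ConfigParser(interpolation=None)
    if path.is_file():
        # Later runs: reuse what was saved earlier.
        config.read(path, encoding="utf-8")
    else:
        # First run: ask the user, then write the answers to the config file.
        config["auth"] = {
            "login": ask("Login: "),
            "pass": ask("Password: "),
        }
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w", encoding="utf-8") as handle:
            config.write(handle)
    return config["auth"]["login"], config["auth"]["pass"]
```

Taking the prompt function as a parameter keeps the helper usable from a GUI or a test, where input() is not appropriate.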

Related:

  • How to setup application to personalize it?
  • How to install writable shared and user specific data files with setuptools?

Including non-Python files with setup.py

Probably the best way to do this is to use the setuptools package_data directive. This does mean using setuptools (or distribute) instead of distutils, but this is a very seamless "upgrade".

Here's a full (but untested) example:

from setuptools import setup, find_packages

setup(
    name='your_project_name',
    version='0.1',
    description='A description.',
    packages=find_packages(exclude=['ez_setup', 'tests', 'tests.*']),
    package_data={'': ['license.txt']},
    include_package_data=True,
    install_requires=[],
)

Note the specific lines that are critical here:

package_data={'': ['license.txt']},
include_package_data=True,

package_data is a dict of package names (empty = all packages) to a list of patterns (can include globs). For example, if you want to only specify files within your package, you can do that too:

package_data={'yourpackage': ['*.txt', 'path/to/resources/*.txt']}

The solution here is definitely not to rename your non-py files with a .py extension.

See Ian Bicking's presentation for more info.

UPDATE: Another [Better] Approach

Another approach that works well if you just want to control the contents of the source distribution (sdist) and have files outside of the package (e.g. top-level directory) is to add a MANIFEST.in file. See the Python documentation for the format of this file.

Since writing this response, I have found that using MANIFEST.in is typically a less frustrating approach to just make sure your source distribution (tar.gz) has the files you need.

For example, if you wanted to include requirements.txt from the top level and recursively include the top-level "data" directory:

include requirements.txt
recursive-include data *

Nevertheless, in order for these files to be copied at install time to the package’s folder inside site-packages, you’ll need to supply include_package_data=True to the setup() function. See Adding Non-Code Files for more information.

Accessing data files before and after distutils/setuptools

I've used a utility method called data_file:

import os

def data_file(fname):
    """Return the path to a data file of ours."""
    return os.path.join(os.path.dirname(__file__), fname)

I put this in the __init__.py file in my project, and then call it from anywhere in my package to get a file relative to the package.

Setuptools offers a similar function, but this doesn't need setuptools.
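On Python 3.9 and later, the standard library's importlib.resources.files() covers the same need without relying on __file__. A minimal sketch, where read_data_text is a helper name made up for illustration:

```python
from importlib import resources


def read_data_text(package, filename):
    """Return the text of a data file that ships inside an importable package."""
    return resources.files(package).joinpath(filename).read_text(encoding="utf-8")
```

For example, read_data_text("mypackage", "config/config.json") would read a file installed alongside mypackage's modules, assuming that file was actually included as package data.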


