I have a new library that has to include a lot of subfolders of small datafiles, and I\'m trying to add them as package data. Imagine I have my library as so:
__init__.py
.Generate the lists of files and directories using standard Python code, instead of writing it literally:
data_files = []
directories = glob.glob('data/subfolder?/subfolder??/')
for directory in directories:
files = glob.glob(directory+'*')
data_files.append((directory, files))
# then pass data_files to setup()
@gbonetti's answer, using a recursive glob pattern, i.e. **
, would be perfect.
However, as commented by @daniel-himmelstein, that does not work yet in setuptools package_data
.
So, for the time being, I like to use the following workaround, based on pathlib
's Path.glob():
def glob_fix(package_name, glob):
# this assumes setup.py lives in the folder that contains the package
package_path = Path(f'./{package_name}').resolve()
return [str(path.relative_to(package_path))
for path in package_path.glob(glob)]
This returns a list of path strings relative to the package path, as required.
Here's one way to use this:
setuptools.setup(
...
package_data={'my_package': [*glob_fix('my_package', 'my_data_dir/**/*'),
'my_other_dir/some.file', ...], ...},
...
)
The glob_fix()
can be removed as soon as setuptools supports **
in package_data
.