Simple documentation for setup.cfg

Have a look at build.sh which copies all this into a build directory.

Yes - that’s the tutorial I linked to in post 6.

I see that you’re setting up the actual package structure for building via the build.sh script. This results in a project tree with the following layout:

build/
├── LICENSE
├── MANIFEST.in
├── README.md
├── pyproject.toml
├── setup.cfg
└── src/
    ├── PopoutApps/
    │   ├── __init__.py
    │   └── qucs_netlist.py
    ├── qucs_netlist.dat
    ├── qucs_netlist.hlp
    └── qucs_netlist.html

The problem is that the qucs_netlist.{dat,hlp,html} files are outside of your actual Python package (the PopoutApps directory), and thus they do not count as package data and will not be included in either your sdist or your wheel. You need to move them inside PopoutApps.

The next step is to correctly tell setuptools to include the files. If you’re using MANIFEST.in, there’s no need for [options.package_data] in setup.cfg, so let’s get rid of the latter. All you need to finish off the use of MANIFEST.in is to set include_package_data = True in the [options] section of setup.cfg, and it should all work out.
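One way to check whether the data files actually made it into the wheel, before installing anything, is to list the archive's contents. The following is only a minimal sketch: the helper name `list_package_data` is made up for illustration, and it relies on nothing more than the fact that a wheel is an ordinary zip archive.

```python
import zipfile
from pathlib import PurePosixPath


def list_package_data(wheel_path, package):
    """Return the non-Python files stored under `package/` in a wheel.

    A wheel is an ordinary zip archive, so zipfile can read it directly.
    (Hypothetical helper, not part of setuptools.)
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            # keep only entries whose first path component is the package
            if PurePosixPath(name).parts[:1] == (package,)
            and not name.endswith("/")
            and PurePosixPath(name).suffix not in {".py", ".pyc"}
        )
```

Calling it on the wheel in your dist/ directory with "PopoutApps" as the package should list the three qucs_netlist data files if the configuration is right, and an empty list if they were left out.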


But this does point out once again a very confusing aspect of setuptools: if you set options.package_data and also set include_package_data = True, those control separate behaviors and one will be ignored (and as I remember, without any warning).


Done. It still doesn’t find the files and they don’t appear in
/home/chris/.local/lib/python3.10/site-packages/PopoutApps
assuming that’s where they should be. Do I need a prefix to the filenames, like the name of the app followed by ‘data’ or anything of that sort? If they appeared in the directory above I might be able to work it out for myself.

Is there a bug report for this? I imagine there would be backward compatibility issues with “just fixing” it, but at least having something to track the issue, discuss how to document the behaviour, etc, would be useful.


It looks to me like piecemeal additions to a design from long ago - independently developed without any reference to each other (I’m ready to be told that I’m completely wrong:)).

As a newbie I can’t see why all this information shouldn’t be in one file like a Flatpak manifest.

OK, I’ve figured out the problem. You need to change the contents of your MANIFEST.in to:

include src/PopoutApps/qucs_netlist.dat src/PopoutApps/qucs_netlist.html src/PopoutApps/qucs_netlist.hlp

or, even better:

graft src
global-exclude *.py[cod]

The reason the previous MANIFEST.in was working for me was, I suspect, because my use of one of the above at one point while fiddling around caused the files to be listed in qucs_netlist.egg-info/SOURCES.txt, and then, with nothing deleting that directory, the SOURCES.txt continued to be used when I switched the MANIFEST.in back to the original form.
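To rule out that kind of stale state, the leftover *.egg-info directories can be deleted before each rebuild. A rough sketch follows, assuming a src layout like the one above; the function name is hypothetical, and it deliberately touches nothing except directories ending in .egg-info.

```python
import shutil
from pathlib import Path


def remove_stale_egg_info(project_root):
    """Delete *.egg-info directories in project_root and project_root/src.

    Returns the removed directories as a sorted list of strings.
    (Hypothetical helper; only .egg-info directories are removed.)
    """
    root = Path(project_root)
    candidates = list(root.glob("*.egg-info")) + list(root.glob("src/*.egg-info"))
    removed = []
    for path in candidates:
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(str(path))
    return sorted(removed)
```

Running it on the project directory before building forces setuptools to regenerate SOURCES.txt from the current MANIFEST.in rather than reusing an old one.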


I’ll choose the former as I understand it.

I’ve had problems like this but not been sure what hidden files to delete.

Progress at last! The files are now being put into /home/chris/.local/lib/python3.10/site-packages/PopoutApps. The package still fails; I assume I need to add something to refer to the right directory.

Your package’s code simply refers to the files by their filename, without any directory component, which means Python will look for them in the process’s current working directory (usually the directory from which you ran Python). See the second part of the article I linked for how to properly access package data files at runtime.
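A quick way to see this for yourself is to ask Python where a bare filename would end up. This is just an illustrative sketch; the filename is taken from the thread and the file does not need to exist for the check to work.

```python
import os
from pathlib import Path

name = "qucs_netlist.dat"

# A bare filename is resolved against the current working directory,
# not against the directory that contains the module calling open().
resolved = Path(name).absolute()

print("Python would look for the file at:", resolved)
print("Is that inside the current working directory?",
      resolved == Path(os.getcwd()) / name)
```

Running this from different directories shows the path changing each time, which is why the installed package cannot find files that live in site-packages/PopoutApps.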

Thanks for your help up to this point, you’ve obviously put some effort into understanding my mess. Adding simplicity rather than complexity is always the best way…

The importlib-resources API

That looks like a bundle of laughs. I find it hard to believe that Python packaging can make something as simple as reading a file so complicated.

Rather than spending another week wrestling with that, I’m thinking of printing instructions to go onto Github to download the reference data and helpfile. Apart from the advantage of making it easier for people who might find the package useful, the package would then have no real advantage over just downloading the files into a local directory and running it from there.

[…]

I find it hard to believe that Python packaging can make something
as simple as reading a file so complicated.

I find it hard to believe that you don’t realize providing a
flexible mechanism for including arbitrary files is complicated.
If you’re quite sure there’s a simpler alternative, I’m sure
everyone will look forward to seeing your replacement implementation
for it.

You may be familiar with Debian, RPM and Flatpak, Meson etc.

Meson:

data_files = [
  'myapp.dat'
]

Python program: 
`	homefold = os.getenv('XDG_HOME')`

[…]

Unfortunately you sound as though you are taking this personally,

Only because you implied that the people designing this made it
complex intentionally. The topic is complex, and it’s a question of
how much complexity the implementation hides (and also how much
flexibility it takes away from you when it does so).

but to answer your comment - you may be familiar with Debian, RPM
and Flatpak, Meson etc.

Meson:

data_files = [
  'myapp.dat'
]

Python program: 
`	homefold = os.getenv('XDG_HOME')`

Python 3.9:

import importlib.resources

importlib.resources.files("mypackage") / "myapp.dat"

Have you looked at John’s link?


Accessing Package Data at Runtime

There have been multiple ways to access package data over the years, from pkg_resources’ ResourceManager API to pkgutil.get_data(), but the most recent and currently-recommended way is with the importlib-resources package.

Installing & Importing importlib-resources

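There are two versions of importlib-resources available:

  • The one in the Python standard library from Python 3.7 onward, imported as importlib.resources

  • The one on PyPI, a backport imported as importlib_resources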

Development of the PyPI version tends to be ahead of whatever’s in the latest Python version. In particular, the new files()-based API described here was only introduced in version 1.1.0 of the PyPI project and was only added to the Python standard library in Python 3.9. In order to be guaranteed a version of importlib-resources that supports this API, you should add the following to your project’s install_requires:

importlib-resources>=1.1.0; python_version < '3.9'

and import importlib-resources in your code as follows:

import sys

if sys.version_info < (3, 9):
    # importlib.resources either doesn't exist or lacks the files()
    # function, so use the PyPI version:
    import importlib_resources
else:
    # importlib.resources has files(), so use that:
    import importlib.resources as importlib_resources

The importlib-resources API

To access a package data file in your project, start by calling importlib_resources.files() on the name of your package:

pkg = importlib_resources.files("packagename")
# The argument can optionally refer to a subpackage in the form
# "packagename.subpackage".

This gives you a Traversable object that acts like a limited pathlib.Path object for traversing package data files. To refer to a data.csv file in a data/ directory in your package, write:

pkg_data_file = pkg / "data" / "data.csv"

So now that we’ve got a reference to the package data file, how do we get anything out of it?

  • To open the file for reading, call the open() method:

with pkg_data_file.open() as fp:
    # Do things with fp

  • To get the file’s contents as bytes, call the read_bytes() method:

b = pkg_data_file.read_bytes()

  • To get the file’s contents as a str, call the read_text() method, optionally with an encoding argument:

s = pkg_data_file.read_text(encoding="utf-8")

  • To get the path to the file, call importlib_resources.as_file() on it and use the return value as a context manager:

with importlib_resources.as_file(pkg_data_file) as path:
    # Do things with the pathlib.Path object that is path

The use of context managers allows importlib-resources to support packages stored in zipfiles; when a path is requested for a package data file in a zipfile, the library can extract the file to a temporary location at the start of the with block and remove it at the end of the block.

  • To iterate through a directory (either a package or a non-package directory), use the iterdir() method. You can test whether a resource is a directory or a file with the is_dir() and is_file() methods, and you can get a resource’s basename via the name property:

for entry in (pkg / "data").iterdir():
    if entry.is_dir():
        print(entry.name, "DIR")
    else:
        print(entry.name, "FILE")
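Putting the calls above together, here is a self-contained sketch that can be run without installing anything. It assumes Python 3.9 or later and builds a throwaway package in a temporary directory; the names demo_pkg and data.csv are invented purely for the demonstration.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def demo():
    """Create a temporary package with one data file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_pkg"  # hypothetical package name
        (pkg_dir / "data").mkdir(parents=True)
        (pkg_dir / "__init__.py").write_text("", encoding="utf-8")
        (pkg_dir / "data" / "data.csv").write_text("a,b\n1,2\n", encoding="utf-8")

        # Make the temporary package importable for the duration of the demo.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.resources.files("demo_pkg")
            data_file = pkg / "data" / "data.csv"
            text = data_file.read_text(encoding="utf-8")
            names = sorted(entry.name for entry in (pkg / "data").iterdir())
            # as_file() yields a real filesystem path for the resource.
            with importlib.resources.as_file(data_file) as path:
                exists_on_disk = path.is_file()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)
        return text, names, exists_on_disk


text, names, exists_on_disk = demo()
print(repr(text), names, exists_on_disk)
```

Nothing in the lookup depends on the current working directory, which is the point of going through the package name rather than a bare filename.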

[…]

Yes, and I read it. Have you? There are no real surprises for me in
there. It covers a number of possible solutions. Pick whichever
one(s) is/are most appropriate for your situation. It also covers a
lot of the history and explanation for what has been changing over
time with regard to this, but that doesn’t mean you need to do all
the things it describes.

If you want to do this in a way which covers multiple older versions
of the language and library, then you’ll essentially be carrying
multiple version-specific solutions, but that’s not much different
from anything else which has changed in Python over the years. With
importlib.resources now hopefully stabilizing in the stdlib since
3.9, the older solutions should fade with time. I’d recommend
writing to current Python and then tossing in some conditionals or
try/except blocks to fall back to what works on older versions you
may still need to support (until the time comes that you’re able to
delete them).
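As a rough illustration of that try/except approach: the sketch below assumes that the importlib-resources backport (version 1.1.0 or later) is installed on interpreters older than 3.9, and read_package_text is a hypothetical helper name, not part of either library.

```python
try:
    # Python 3.9+: files() is available in the standard library.
    from importlib.resources import files
except ImportError:
    # Older interpreters: fall back to the PyPI backport
    # (assumes importlib-resources>=1.1.0 is installed there).
    from importlib_resources import files


def read_package_text(package, filename, encoding="utf-8"):
    """Return the text of a data file shipped inside `package`."""
    return (files(package) / filename).read_text(encoding=encoding)
```

Inside the package's own code this could be called as read_package_text(__package__, "qucs_netlist.hlp"), and the except branch can simply be deleted once the older Python versions are no longer supported.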

Hi Chris, sorry to hear that you had problems with the docs.

As the others previously mentioned, unfortunately the setup.cfg example shows an outdated configuration option that is deprecated (I submitted a PR to remove this from the example earlier today, so hopefully others will not have the same problem in the future).
As Dustin has mentioned, this is the best place to look:

But I am glad that at the end of the day you managed to get the packaging right.
Regarding the way to read the files, unfortunately it is not trivial because the file is not always available on disk.

But, among all the options, I would stick with importlib.resources. Assuming you are using Python>=3.7, you can try the following in your code:

from importlib import resources

filehelp_contents = resources.read_text(__package__, filehelp)

with resources.path(__package__, filebrowser) as filebrowser_path:
    # do something with filebrowser_path inside the with block

If you are not using Python>=3.7, you can try the importlib-resources backport from PyPI.


By the way, in the future, if you are not an experienced user, it might just be easier to avoid creating the package structure on the fly and stick with things already present in the file system.

There are a number of templates that will make that easier for you, for example cookiecutter-pypackage or PyScaffold.


Please use the new API in favor of the legacy one:

from importlib import resources

filehelp_traversable = resources.files(__package__) / filehelp

with resources.as_file(filehelp_traversable) as filebrowser_path:
    # do something with filebrowser_path inside the with block

That is correct, include_package_data overrules package_data without any warning. Nearly everything that’s got to do with package and package data handling in setuptools is totally non-obvious and inelegant, but setuptools does provide a lot of flexibility in that regard.


Thanks for posting the issue link, I’d just found it myself. The worst part of this from my point of view is that include_package_data and package_data have a different meaning of ‘package data’. For one, it can be any file from anywhere in the source tree, and for the other the files have to be colocated in the package source directory.
