Help testing experimental features in setuptools

Hello all.

For the past few months we have been experimenting with adding some features to setuptools, especially support for project metadata in pyproject.toml (initially introduced in PEP 621).

Other features complement it: automatic layout discovery (for flat, src or single-module layouts), defaulting to include_package_data=True, etc. The idea is to make the [tool.setuptools] table in pyproject.toml unnecessary for the simple use cases.

If anyone is interested in helping test these features in their packages before they are merged into the main branch, it would be super helpful!

How can you help?

If you have a project that uses setuptools, you can help by doing the tests below (or similar):

  1. In your pyproject.toml you can try replacing requires = ["setuptools",...] with something similar to the following:

    [build-system]
    requires = [
        "setuptools @ git+https://github.com/pypa/setuptools@experimental/support-pyproject",
        ...
    ]
    

    and running your build step (e.g.: python -m build).
    This is a simple test to try to identify regressions when users still want to keep using setup.cfg.

  2. After doing the change in 1, you can try porting the metadata in your setup.cfg to pyproject.toml and running the build step again (python -m build).

    The ini2toml tool can help you with the conversion.
    The command below will print to stdout most of the config that needs to be added:

    # pipx install 'ini2toml[full]'
    ini2toml setup.cfg
    

    Please double check to make sure everything is correct.
    Also note that ini2toml has some limitations and will not automatically merge the generated content into an existing pyproject.toml (so that step has to be done manually, and it is important to keep the URL in [build-system] requires as shown in test 1).
    There is some useful information here:

  3. If everything works correctly with tests 1 and 2, you can also experiment with the automatic discovery.
    Configurations like the following should be no longer necessary (so you can try removing them from pyproject.toml):

    # pyproject.toml
    [tool.setuptools]
    include-package-data = true
    package-dir = {"" = "src"}
    
    [tool.setuptools.packages.find]
    where = ["src"]
    exclude = ["tests"]
    namespaces = true
    

    You can also try removing py_modules = or packages = to see if the new implementation manages to find the correct modules/packages.
    There is some information about the automatic discovery here.
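After the manual merge in test 2, a small helper can confirm that the experimental branch is still listed in build-system.requires and point out which discovery-related settings from test 3 are still present. This is only a sketch: the function names are hypothetical, the key names are assumed to follow the [tool.setuptools] examples above, and `config` stands for the dictionary you get from parsing pyproject.toml (for example with tomllib on Python 3.11+).

```python
EXPERIMENTAL_REF = "git+https://github.com/pypa/setuptools@experimental/support-pyproject"

# Settings that test 3 suggests trying to remove (key names are an assumption
# based on the [tool.setuptools] examples in this post).
DISCOVERY_KEYS = ("include-package-data", "package-dir", "packages", "py-modules")


def uses_experimental_branch(config: dict) -> bool:
    """Return True if build-system.requires still points at the experimental branch."""
    requires = config.get("build-system", {}).get("requires", [])
    return any(EXPERIMENTAL_REF in req for req in requires)


def discovery_settings_present(config: dict) -> list:
    """List the [tool.setuptools] keys that auto-discovery may make unnecessary."""
    table = config.get("tool", {}).get("setuptools", {})
    return [key for key in DISCOVERY_KEYS if key in table]


# Hand-written equivalent of a parsed pyproject.toml, for illustration only.
config = {
    "build-system": {"requires": ["setuptools @ " + EXPERIMENTAL_REF, "wheel"]},
    "tool": {
        "setuptools": {
            "include-package-data": True,
            "package-dir": {"": "src"},
            "packages": {"find": {"where": ["src"]}},
        }
    },
}
print(uses_experimental_branch(config))
print(discovery_settings_present(config))
```

If the first check fails after running ini2toml, the URL from test 1 was probably lost during the merge.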

Please let me know if something breaks :smile:

Notes

After each test, it might be a good idea to inspect the generated distribution archives:

unzip -l dist/*.whl
tar tf dist/*.tar.gz
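If unzip or tar are not available (e.g. on Windows), the same inspection can be done with the standard library. A minimal sketch, where `list_archive` and `list_dist` are hypothetical helper names:

```python
import tarfile
import zipfile
from pathlib import Path


def list_archive(path) -> list:
    """Return the member names of a wheel (.whl) or an sdist (.tar.gz)."""
    path = Path(path)
    if path.suffix == ".whl":
        # Wheels are ordinary zip files.
        with zipfile.ZipFile(path) as wheel:
            return sorted(wheel.namelist())
    # The default read mode auto-detects the compression of the tarball.
    with tarfile.open(path) as sdist:
        return sorted(sdist.getnames())


def list_dist(dist_dir="dist") -> dict:
    """Map every archive found in dist_dir to the list of files it contains."""
    dist = Path(dist_dir)
    archives = sorted(dist.glob("*.whl")) + sorted(dist.glob("*.tar.gz"))
    return {archive.name: list_archive(archive) for archive in archives}
```

Calling `list_dist()` from the project root after a build returns one entry per archive, which makes it easy to compare the file lists between tests.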

For maximum isolation you can also try removing build/dist files between each test:

rm -rf build dist *.egg-info  # or src/*.egg-info if you use the src-layout
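A portable equivalent of the clean-up command, in case rm is not available. This is just a sketch with a hypothetical function name; it removes the same folders as the command above, covering both the flat and the src layout:

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root=".") -> list:
    """Delete build/, dist/ and any *.egg-info folder; return what was removed."""
    root = Path(project_root)
    targets = [root / "build", root / "dist"]
    targets += list(root.glob("*.egg-info")) + list(root.glob("src/*.egg-info"))
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target.relative_to(root).as_posix())
    return sorted(removed)
```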

There is a backward incompatibility with the auto-discovery:

  • The new implementation will consider implicit namespace packages by default (as in PEP 420), so if you have extra folders in your project root with Python files in them, you might see extra files in your generated wheel that were not there before.
  • There are some rules in place to exclude common directories, e.g. tools, examples, but they will not cover everything.
  • The way to solve this is by explicitly adding the correct parameters in [tool.setuptools.packages.find] or by renaming your extra folders as “private” folders (using a leading _ or ., e.g. experiments => _experiments).
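To get a rough idea of which folders might be affected before building, something like the sketch below can help. It is only an approximation written for this post, not the actual setuptools logic: the function name is hypothetical, and the exclusion set contains just the two names mentioned above, while the real list is longer.

```python
from pathlib import Path

# Only the names mentioned in this post; the real exclusion rules cover more.
COMMONLY_EXCLUDED = {"tools", "examples"}


def possibly_discovered(project_root=".") -> list:
    """Top-level folders that contain Python files and are not obviously excluded."""
    found = []
    for entry in sorted(Path(project_root).iterdir()):
        if not entry.is_dir():
            continue
        if entry.name.startswith(("_", ".")) or entry.name in COMMONLY_EXCLUDED:
            continue  # "private" or commonly excluded folder
        if any(entry.glob("*.py")):
            found.append(entry.name)
    return found
```

Any folder in the result that is not supposed to be shipped is a candidate for renaming or for an explicit [tool.setuptools.packages.find] configuration.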

Currently there is very little documentation about this style of build in setuptools. That is the next step in my plan :grin:




Known issues/non-intuitive behaviour (not blocking)

  1. warning: check: missing required meta-data: url - This warning appears if you don’t specify a Homepage in project.urls (or homepage, Home-page etc…)
    The action point here is to make the warning message more specific.
    (I added some clarification about this in pypa/setuptools#3182. While the original warning will still appear, there is a preceding message specifically talking about Homepage)
  2. warning: check: missing meta-data: if 'author' supplied, 'author_email' should be supplied too - see: Prevent warn on false positive for author/maintainer's email by abravalheri · Pull Request #116 · pypa/distutils · GitHub
  3. Auto-discovery considers PEP 420 active by default (this is a desired/intentional behaviour); however, this behaviour might be confusing for users using the flat-layout with a non-conventional package structure. (To be on the safe side, if more than one top-level package is discovered for the flat-layout, setuptools will now raise an exception: pypa/setuptools#3177)
  4. Missing docs (future work) (There are some docs now: pypa/setuptools#3172)

Existing setuptools issues/limitations/non intuitive behaviour (not related to this change)

  1. When using tool.setuptools.packages.find.exclude, sub-packages don’t get automatically excluded. The provided glob pattern has to match the entire package name.
  2. py.typed files are not included by default.
  3. PKG-INFO file in sdist is missing Requires-Dist fields - although this is not entirely compliant with the core metadata specification, it will work without problems. Setuptools-based sdists use a *.egg-info/requires.txt file to specify dependencies instead of PKG-INFO. The current version of pip is aware of that.
  4. Output/traceback is too big/verbose: setuptools will print log messages to both stdout and stderr depending on the level configuration (DEBUG or >= WARNING). You might see all of them if you don’t redirect the stderr output.
  5. setuptools might require cleaning the project (e.g. rm dist build *.egg-info) before new settings take effect.
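The first item in this list (exclude patterns have to match the entire package name) can be illustrated with the standard library alone. The sketch assumes shell-style matching against the full dotted name, as described above; `is_excluded` is a hypothetical helper, not a setuptools function.

```python
from fnmatch import fnmatchcase


def is_excluded(package: str, patterns) -> bool:
    """True if any pattern matches the *entire* dotted package name."""
    return any(fnmatchcase(package, pattern) for pattern in patterns)


packages = ["mypkg", "tests", "tests.unit"]

# "tests" alone leaves the sub-package in place...
print([p for p in packages if not is_excluded(p, ["tests"])])
# → ['mypkg', 'tests.unit']

# ...while adding "tests.*" also removes it.
print([p for p in packages if not is_excluded(p, ["tests", "tests.*"])])
# → ['mypkg']
```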

Limitations in other tools (not related to this change)

  1. ini2toml will not merge an existing pyproject.toml into the generated one
  2. The implementation in this change will automatically set the Home-page core metadata based on the value of project.urls. PyPI might show duplicated links unless you use exactly Homepage in project.urls (no dashes, capital first letter):
    [project.urls]
    Homepage = "http://..."
    

I just tried this out on GitHub - jwodder/linesep: Handling lines with arbitrary separators, which is my go-to project for experimenting with packaging stuff. Step 1 worked fine. Step 2 built fine, but I noticed the following:

  • ini2toml dropped the "wheel ~= 0.32" entry from build-system.requires in addition to nullifying the change from step 1.
  • The output from ini2toml lacks a terminating newline.
  • Given how small the project.readme and project.license tables are, I’d prefer to see them rendered as inline tables.
  • The url field in setup.cfg was converted to a project-url named “Homepage”, and this was automatically used in the built metadata as both a project-url and the Home-page field. That seems like an odd way to have users specify the field.

As for step 3, while I was able to get away with commenting out tool.setuptools.package-dir and tool.setuptools.packages.find.namespaces, commenting out any more of the suggested fields failed:

  • With tool.setuptools.packages.find.where = ["src"] commented out (and with the [tool.setuptools.packages.find] header kept), the build failed with “error: package directory ‘src/test’ does not exist”. If I also comment out [tool.setuptools.packages.find], then the build succeeds.
  • With tool.setuptools.package-dir = {"" = "src"} commented out, the build failed with “ModuleNotFoundError: linesep”, apparently while trying to resolve the attr: spec for the version field.

Question: Are there plans to also merge MANIFEST.in into pyproject.toml? It’s needed for more than just package data; e.g., I want to include my docs, CHANGELOG, and tox.ini in sdists.


I also happen to have written my own setuptools plugin that’s like setuptools-scm but more flexible. When I tried these steps out on a project that uses it, although the version field was correctly marked dynamic and left out of the [tool.setuptools.dynamic] field, when I tried building after step 2, my plugin was completely ignored! Are setuptools.finalize_distribution_options entry points no longer honored?

Thanks @abravalheri for working on this!

I tried this out on mdformat.

Here are my notes:

  • I needed to add a MANIFEST.in file to include py.typed. Maybe this is a separate discussion, but it might be convenient to include this by default?
    • After adding MANIFEST.in I did have py.typed in the source distribution, but not the wheel. AFAIK both flit and poetry add this in a wheel, and mypy needs it to be there.
  • dependencies were not added to the source distribution’s PKG-INFO. I.e. Requires-Dist fields were missing.
  • Description-Content-Type in PKG-INFO was set to reST even though the file read is README.md (a Markdown file, not reST)

Other than that all seems good to me!


Thank you very much for the help and the feedback @jwodder.

Currently ini2toml has some limitations, e.g. it does not support merging documents :sweat_smile:, so the automatic conversion is only “informational” for the time being (I am afraid it still requires manually merging the output into the existing pyproject.toml file). Hopefully that will change in the future.

Yep, this behaviour is a bit quirky. I don’t recall PEP 621 having an equivalent for the Home-page core metadata, which makes things worse. I thought that leaving it in both places would be the less problematic approach. Do you think it should be different?

Regarding step 3, my understanding is that I need to clarify the docs further, or raise a warning when no fields are given for find.

What is happening in your example is a combination of 2 effects:

  • The presence of [tool.setuptools.packages.find] will cause setuptools to search for packages when the configuration file is processed.
  • For backward compatibility, the auto-discovery will only kick in afterwards, and only if the user has specified neither packages nor py_modules. So in your example, the auto-discovery will never be activated.

For the time being, I am not considering that, but this can be proposed in the issue tracker. I would like to see the opinions of the other setuptools maintainers before diving in :smile:

I have to double check that. I have not explicitly removed any entry-point and I imagine that the callback is still there. Maybe this problem is related to the auto-discovery only happening after the plugin was already processed? (I will look into that, thank you very much for the pointer!)

Thank you very much @hukkinj1 for the help. I will look at the points you shared this week; there is probably something going wrong there, especially with the dependencies in the PKG-INFO and the automatic Description-Content-Type.

Regarding py.typed, I also think that is a very worthy feature request. I will investigate that further in a follow-up.

I thought that leaving it in both places would be the less problematic approach. Do you think it should be different?

If I were designing pyproject.toml support for setuptools, I’d just have setup.cfg’s url field converted to a tool.setuptools.url (or possibly tool.setuptools.home-page) field. If I wanted a Project URL named “Homepage”, I’d set it myself.

What is happening in your example is a combination of 2 effects: …

While that explains why I get different behavior depending on whether the [tool.setuptools.packages.find] header is commented out, the “error: package directory ‘src/test’ does not exist” behavior seems like a bug to me (Why is it expecting a package directory named “test”? Is it because the default config excludes that package? If it excludes it, why is it a problem if it’s missing?), as does the fact that tool.setuptools.package-dir seems to be required to use attr: version specs when you say it should be automatically determined.

Thanks @jwodder, I plan to investigate this issue further later this week!

I think deviating from PEP 621 for project metadata would be confusing. Specifically, see PEP 621 – Storing project metadata in pyproject.toml | peps.python.org.

But PEP 621 doesn’t define a way to set the Home-page metadata field. You can’t deviate from what isn’t there.

It does.

[project.urls]
homepage = "example.com"

That sets a Project-URL metadata field, not the Home-page field.

The homepage field can be taken from there. PEP 621 does not specify it, so backends can implement it however they like. I think the best option would be to use project.urls.homepage.


The Home-Page metadata key is treated roughly equivalently to a Homepage project URL on PyPI. There are no other known usecases of that key.

One of the “goals” of PEP 621 was to consolidate some of the “here’s two ways to achieve the same end goal” as well as separate the metadata layer names from the overall package. So, as folks are saying here: putting it in project.urls out of the box is reasonable. I think it’s fine to add a tool.setuptools.homepage-url — that isn’t “out of spec”, it’s just that the difference between doing it that way and the way it’s done currently is too-subtle and having a smaller surface area would be helpful IMO (PS: don’t make the mistake of mirroring setuptools’ bad defaults and option names in pyproject.toml please!).

This means that we’ve omitted some keys that felt redundant, or didn’t really see much value to.

[…]

Just to clarify, there are no other use cases for the Home-Page
metadata key known to you? Can you be more specific about what you
mean by “known” in this instance? How was that statistic arrived at?

Similarly, will the License metadata key be discarded in favor of
the Trove classifiers for License? PyPI treats them roughly
equivalently, and I know of no other use cases for it.

Currently what is implemented is that project.urls.homepage (and variants like home-page, Home-page, etc…) is being used to backfill the core metadata field Home-page. A similar thing is also done for Download-URL.

This looked to me like the “safest” approach, since I conjectured that other tools could have been relying on these fields.

It is nice to know that PyPI will do the right thing if Home-page is not present, but an equivalent entry is found in Project-URL.
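The backfilling described above can be sketched roughly as follows. This is an illustration only, not the code in the branch: the function name is hypothetical and the normalisation rule (ignore case, dashes, underscores and spaces) is an assumption about which label variants count as “homepage”.

```python
import re


def backfill_home_page(urls: dict):
    """Return the URL of a "homepage"-like label, or None if there is none."""
    for label, url in urls.items():
        # Assumed normalisation: "Home-page", "home_page", "HomePage" -> "homepage"
        if re.sub(r"[-_\s]", "", label).lower() == "homepage":
            return url
    return None
```

For example, `backfill_home_page({"Home-page": "https://example.com"})` returns the URL, while a table with only a Documentation entry returns None and Home-page would be left unset.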


Hi @hukkinj1, I was just investigating this issue first since it seems very problematic.

To understand the state of affairs I decided to inspect build as an example (it currently depends on setuptools). As we can see by running the following commands, it does have a series of dependencies:

$ wget https://files.pythonhosted.org/packages/py3/b/build/build-0.7.0-py3-none-any.whl
$ unzip -c build-0.7.0-py3-none-any.whl build-0.7.0.dist-info/METADATA | grep -i requires
Requires-Python: >=3.6
Requires-Dist: packaging (>=19.0)
Requires-Dist: pep517 (>=0.9.1)
...

However, these dependencies are currently not being added to PKG-INFO:

$ wget https://files.pythonhosted.org/packages/source/b/build/build-0.7.0.tar.gz
$ tar xOf build-0.7.0.tar.gz build-0.7.0/src/build.egg-info/PKG-INFO | grep -i requires
Requires-Python: >=3.6

Instead, they reside in a requires.txt inside the egg-info folder:

$ tar xOf build-0.7.0.tar.gz build-0.7.0/src/build.egg-info/requires.txt
packaging>=19.0
pep517>=0.9.1
tomli>=1.0.0
...

So this behaviour is just regular setuptools behaviour, not specific to the “project-metadata in pyproject.toml” feature implementation. I will file an issue in the issue tracker (or at least check whether one already exists).


Update: the issues already exist: #1716 and #1805.
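The same comparison can be scripted for any sdist. A minimal sketch using only the standard library; `sdist_dependencies` is a hypothetical helper, and it assumes the layout shown above (a *.egg-info folder containing PKG-INFO and requires.txt):

```python
import tarfile
from email.parser import Parser


def sdist_dependencies(sdist_path) -> dict:
    """Collect the dependencies declared in egg-info PKG-INFO and requires.txt."""
    found = {"PKG-INFO": [], "requires.txt": []}
    with tarfile.open(sdist_path) as sdist:
        for member in sdist.getmembers():
            if not member.isfile() or ".egg-info/" not in member.name:
                continue
            name = member.name.rsplit("/", 1)[-1]
            if name not in found:
                continue
            text = sdist.extractfile(member).read().decode("utf-8")
            if name == "PKG-INFO":
                # Core metadata uses an email-header-like format.
                found[name] = Parser().parsestr(text).get_all("Requires-Dist") or []
            else:
                deps = []
                for line in text.splitlines():
                    line = line.strip()
                    if line.startswith("["):
                        break  # sections for extras follow; keep only the base deps
                    if line:
                        deps.append(line)
                found[name] = deps
    return found
```

For an sdist like the one above, the "PKG-INFO" list would be empty while the "requires.txt" list holds the requirements.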


I’ve tried out getting my minesweeper project to work with your setuptools branch (see here).

I came across a few issues:

  • setuptools._vendor._validate_pyproject.fastjsonschema_exceptions.JsonSchemaValueException: data.urls.homepage must be url - my URL field did not start with ‘https://’, but the example in PEP 621 suggested this should work
  • setuptools._vendor._validate_pyproject.fastjsonschema_exceptions.JsonSchemaValueException: data.minegauler-bot must be python-entrypoint-reference - I was specifying "minegauler.cli" as an entrypoint, expecting it to do the equivalent of python -m minegauler.cli, but it seems I needed to use "minegauler.cli.__main__:main", is this expected?
  • I hit an issue with optional dependencies that I haven’t diagnosed yet, I may report back later
  • When building sdist I see “warning: check: missing meta-data: either (author and author_email) or (maintainer and maintainer_email) should be supplied” - this is despite the fact I have both authors and maintainers, e.g. authors = [{name = "Lewis Gaul", email = "minegauler@gmail.com"}]
  • I was hoping packages and package_data would still be possible to specify in setup.py rather than figuring out the tool.setuptools section or adding a setup.cfg, but these seem to be ignored
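On the entry-point error: a script entry needs to point at a callable, which is conventionally written as module:object. The check below is a rough sketch of that shape only; the regular expression is a simplification written for this post (it ignores the optional extras suffix, for instance) and is not the pattern used by the validator.

```python
import re

# A Python identifier: a letter or underscore followed by word characters.
IDENT = r"[^\W\d]\w*"
# "package.module:object.attr": dotted module path, a colon, dotted object path.
CALLABLE_REF = re.compile(rf"^{IDENT}(\.{IDENT})*:{IDENT}(\.{IDENT})*$")


def is_callable_reference(ref: str) -> bool:
    """True if ref has the module:object shape expected for a script entry."""
    return CALLABLE_REF.match(ref) is not None


print(is_callable_reference("minegauler.cli"))                # → False
print(is_callable_reference("minegauler.cli.__main__:main"))  # → True
```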

Other questions/thoughts:

  • I’m seeing “Install trove-classifiers to ensure validation.” - is this something that should be included in build-system: requires?
  • One other thought: when hitting these backend errors a pretty big setuptools traceback is output, which doesn’t feel very user-friendly (I only needed the exception message).

Looking forward to this functionality getting released, thanks for your work on this!


Thank you very much for the feedback Lewis. Most of the points you specified are definitely corrections to be made. I am collecting all these issues in a backlog!

If you add trove-classifiers to your build-system: requires it should do the validation. Let me know if it does not work.

I will have a look at minesweeper to understand what is happening. I was doing an experiment with pybind11, and it seemed to work at that point in time.

Ah in which case it could well be something to do with me switching to namespace packages at the same time or something - I’m happy to diagnose a bit further and let you know what I find :slight_smile:
