Pre-PEP: Standardizing test dependency and command specification
Hello Pythonistas. I’d like to start a discussion on an idea that has been brewing in my brain for a while now. Let’s call it a pre-PEP, but nothing here is set in stone. I merely wish to present an idea and see if it sticks. If it does, I am willing to work on this and make it into a real PEP.
Abstract
This (not yet a) PEP specifies a mechanism for storing test requirements and commands in pyproject.toml files such that test runners can retrieve them and execute tests. It could look like this:
[tests]
extras = ["extra1", "extra2"]
dependency_groups = ["group1", "group2"]
dependencies = ["pytest>5", "tomli;python_version<'3.11'"]
environment = { HOSTNAME = "localhost" }
commands = [["pytest", "-v"]] # or perhaps ["pytest -v"]
Motivation
In Fedora, we like to utilize Python packaging standards when we build Python packages.
We can get information about the build and runtime dependencies, we can get information about license files, and license expressions. One thing that remains “muddy” is testing. When we build Python packages as RPMs, we ask our packagers to:
- figure out how upstream specifies test dependencies
- figure out how upstream runs the tests
We have several tools available in our toolbox, so e.g. if upstream uses tox, the packagers can opt in to use tox to figure this out. If upstream uses extras or dependency groups or requirements.txt files for test dependencies, the packagers need to figure that out and reuse it. I would love for our packagers to have a simple way to get test dependencies and test commands, so that Fedora’s Python RPM packages would not each need to differ in this regard.
Beyond Fedora or other distributions, we believe that using this standard would allow casual contributors to easily discover how tests are executed. E.g. a developer could clone a repository and run a well-known tool (e.g. tox or a potential simpler test runner) without examining the project structure and/or CI configuration. It would also allow project maintainers to have a canonical source of test dependencies and commands that could be shared between their local development tools and CI.
Perhaps this could be seen as a replacement for test_suite and tests_require, which have been left out in the PEP 517 transition.
Rationale
When we brought this up in the past in various discussions, we’ve been told: “use tox” – we have tried, but unfortunately not all upstreams use tox, and asking them to use one specific tool to run their tests might not be as well received as asking them to follow a standard.
This is why I’d like to propose a standard that existing test runners could follow. For example, tox or cibuildwheel could gain support for this in addition to the existing support for their native configuration:
Example of relevant existing tox configuration
[tool.tox.env_run_base]
extras = ["extra1", "extra2"]
dependency_groups = ["group1", "group2"]
deps = ["pytest>5", "tomli;python_version<'3.11'"]
set_env = {HOSTNAME = "localhost"}
commands = [["pytest", "-v"]]
Example of relevant existing cibuildwheel configuration
[tool.cibuildwheel]
test-extras = ["extra1", "extra2"]
test-groups = ["group1", "group2"]
test-requires = ["pytest>5", "tomli;python_version<'3.11'"]
environment = {HOSTNAME = "localhost"}
test-command = "pytest -v"
Specification
This pre-PEP defines a new section (table) in pyproject.toml files named tests. The tests table contains the keys specified here and MUST contain at least the commands key. It is heavily inspired by the tool.tox.env_run_base table from tox.
tests table keys
commands key (mandatory)
The commands key contains a list of commands. A test runner will execute them one by one in a sequential fashion until one of them fails (its exit code is non-zero) or all of them succeed. The outcome of the tests is considered successful only if all commands succeeded.
NOTE: One command could either be specified as a list of strings to be passed to subprocess.run() (as tox does it) or it could be specified as a string to be passed to subprocess.run(..., shell=True) or shlex.split() (e.g. as cibuildwheel’s test-command). I see benefit in both approaches; perhaps we can support both?
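To illustrate, both shapes could be normalized to the same argv before execution. This is a sketch of one possible approach, not part of the proposal, and the helper names are made up:

```python
import shlex
import subprocess

def as_argv(command):
    """Normalize one tests.commands entry to an argv list.

    The draft leaves the shape open: a list of strings (as tox
    stores commands) or a single string (as cibuildwheel's
    test-command); both are handled here for illustration.
    """
    if isinstance(command, str):
        return shlex.split(command)  # string form, e.g. "pytest -v"
    return list(command)             # list form, e.g. ["pytest", "-v"]

def run_command(command, env=None):
    """Execute one normalized command and return its exit code."""
    return subprocess.run(as_argv(command), env=env).returncode

print(as_argv("pytest -v"))  # → ['pytest', '-v']
```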
extras key (optional)
A list of names of “extras” from the package to be installed. For example, extras = ["testing"] is equivalent to pip install .[testing]. This key is only allowed when the pyproject.toml also specifies a Python package, which is determined by the presence of either the project or build-system table in the same pyproject.toml. A test runner will ensure the given extras are installed in the test environment before the commands are executed.
dependency_groups key (optional)
A list of names of dependency groups (as defined by PEP 735). A test runner will ensure the given dependency groups are installed in the test environment before the commands are executed.
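For readers unfamiliar with PEP 735: resolving a group, including its include-group entries, can be sketched as follows. This is illustration only (no cycle detection; the group names are made up), and real runners could instead use the dependency-groups package on PyPI:

```python
def resolve_group(groups: dict, name: str) -> list[str]:
    """Flatten one PEP 735 dependency group into requirement strings,
    expanding {include-group = "..."} entries recursively.
    Simplified sketch: no cycle or duplicate detection."""
    out = []
    for item in groups[name]:
        if isinstance(item, dict):
            out.extend(resolve_group(groups, item["include-group"]))
        else:
            out.append(item)
    return out

# Hypothetical [dependency-groups] content:
groups = {
    "coverage": ["coverage[toml]"],
    "test": ["pytest>8", {"include-group": "coverage"}],
}
print(resolve_group(groups, "test"))  # → ['pytest>8', 'coverage[toml]']
```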
dependencies key (optional)
A list of Python dependencies. Each value must be one of:
- a Python dependency specifier as specified by PEP 508,
- a requirements file when the value starts with -r (followed by a file path),
- a constraints file when the value starts with -c (followed by a file path).
A test runner will ensure the given dependencies are installed in the test environment before the commands are executed.
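A runner could translate these values into pip install arguments roughly like this. A sketch only; the exact syntax of the space after -r/-c is my assumption, not something the draft pins down:

```python
def pip_install_args(dependencies: list[str]) -> list[str]:
    """Translate proposed tests.dependencies values into 'pip install'
    arguments: plain requirement specifiers pass through, while
    '-r <file>' and '-c <file>' become separate flag/path pairs."""
    args = []
    for dep in dependencies:
        if dep.startswith(("-r", "-c")):
            flag, path = dep[:2], dep[2:].strip()
            args += [flag, path]
        else:
            args.append(dep)
    return args

print(pip_install_args(["pytest>5", "-r requirements-test.txt", "-c constraints.txt"]))
# → ['pytest>5', '-r', 'requirements-test.txt', '-c', 'constraints.txt']
```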
environment key (optional)
A dictionary of environment variables to be set by a test runner when executing the commands.
Specifications for the test runners
The kind of environment used to execute the tests.commands is up to the test runner. It MAY be a fresh or re-used virtual environment, a container, the current Python environment where the test runner is installed, etc.
A test runner MUST ensure that the defined Python dependencies (via tests.extras, tests.dependency_groups, and tests.dependencies) are installed before it executes tests.commands.
If the pyproject.toml file also has a project or build-system table, the test runner MUST also ensure the very same package is installed (either via a wheel or an editable install); if it doesn’t and the tests.extras key is present, the test runner MUST error.
If the defined Python dependencies and/or the tested package cannot be installed, the test runner MUST error.
A test runner MUST set all environment variables from the tests.environment dictionary. The test runner MAY preserve or clean the existing environment as it deems appropriate (e.g. it MAY allow users to configure this behavior).
A test runner MUST execute tests.commands one by one in a sequential fashion until one of them fails (its exit code is non-zero; in that case the test runner MUST also fail with a non-zero exit code) or all of them succeed (in that case the test runner MUST succeed with a zero exit code).
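The required sequential semantics fit in a few lines; a sketch, where the example commands just invoke the current interpreter for illustration:

```python
import subprocess
import sys

def run_all(commands, env=None) -> int:
    """Run tests.commands sequentially; stop at the first failure and
    propagate its exit code, otherwise return 0 (all succeeded)."""
    for command in commands:
        result = subprocess.run(command, env=env)
        if result.returncode != 0:
            return result.returncode
    return 0

# Two trivial commands: the first succeeds, the second exits with 3,
# so the runner stops there and reports 3.
rc = run_all([
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
])
print(rc)  # → 3
```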
A test runner MUST ensure that executing python or python3 in tests.commands works and executes the same Python for which it installed the dependencies. It MUST also ensure that scripts from the specified test dependencies have preference on PATH.
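One common way to satisfy both requirements is to prepend the test environment’s script directory to PATH before spawning commands; a sketch, where the venv path is hypothetical:

```python
import os

def command_env(venv_bin: str) -> dict:
    """Sketch of the 'python on PATH' requirement: prepend the test
    environment's bin/Scripts directory so that 'python', 'python3',
    and scripts from test dependencies resolve there first.
    'venv_bin' is a hypothetical path, for illustration only."""
    env = dict(os.environ)
    env["PATH"] = venv_bin + os.pathsep + env.get("PATH", "")
    return env

env = command_env("/tmp/testenv/bin")
print(env["PATH"].split(os.pathsep)[0])  # → /tmp/testenv/bin
```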
A test runner MUST execute tests.commands from the Project Root Directory of an unpackaged Project Source Tree; that is, the directory with the pyproject.toml containing the executed tests table.
A test runner MAY support running tests specified in a Distribution Archive, but in that case it MUST extract the archive before executing tests.commands.
Recommendation for projects using this
The definition of “tests” is intentionally left out of this PEP. However, it is assumed by the PEP authors that the specified tests.commands will run unit tests (or similar) of the project. As such, the recommendations are:
- Do not include tests that require complex setup.
- Do not include code linters, type checking, or test coverage – include tests that ensure the software functions.
- Do not include tests.commands that will have unexpected side effects.
- Do not include tests.commands that would (un)install Python packages.
- Do not include tests.commands that are platform specific.
- Do not run test runners (e.g. tox) from tests.commands.
- Do not assume a specific test runner is used.
Package Building
Build backends MUST NOT include the data from the tests table in built distributions as package metadata. This means that PKG-INFO in sdists and METADATA in wheels do not include any referenceable fields containing the test dependencies or commands.
Out-of-scope ideas
Multiple test envs and different Python versions etc. – I don’t want to standardize that. Tools like tox can still use the information from the tests table for the “default” testenv.
What needs to be solved
Passing positional arguments – I find the way tox does it in the TOML configuration a bit clumsy. But if we go with string commands, we can have {posargs}:
[tests]
commands = ["python -m unittest {posargs}"]
Default dependency group – if a dependency group called tests exists, it might be used as a default. That would make the simplest use case simpler, but it might be too magical:
[dependency-groups]
tests = ["pytest>=8"]
[tests]
commands = ["pytest"]
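The fallback could be sketched like this; the helper is hypothetical and nothing here is part of the proposal:

```python
def default_dependency_groups(pyproject: dict) -> list[str]:
    """Sketch of the 'default group' idea: if [tests] does not name
    dependency_groups but a dependency group called 'tests' exists,
    fall back to it; otherwise install no groups."""
    tests = pyproject.get("tests", {})
    explicit = tests.get("dependency_groups")
    if explicit is not None:
        return explicit
    if "tests" in pyproject.get("dependency-groups", {}):
        return ["tests"]
    return []

# The minimal pyproject.toml from above, as a parsed dict:
doc = {
    "dependency-groups": {"tests": ["pytest>=8"]},
    "tests": {"commands": [["pytest"]]},
}
print(default_dependency_groups(doc))  # → ['tests']
```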
Unspecified keys in the tests table – Should we explicitly forbid those or allow the test runners to read them? E.g. tox could read other keys and use them, but there is a risk if a future standard adds them with a different meaning:
[tests]
pass_env = ["FOO"] # only used by tox, but not specified in the standard
...
…and if the PEP forbids this, would it require test runners to ignore additional keys, or error if such keys are found?
Only allow dependency groups? – Perhaps this new standard does not need to cover all possibilities of specifying dependencies. But if it does, the simple use case is simple:
[tests]
dependencies = ["pytest>=8"]
commands = ["pytest"]
…hence, not sure.
Platform specifics – How do we deal with path separators, etc.?