Thanks for posting this. It’s an excellent summary and includes some examples I’ve not seen anywhere else in this discussion. The proposed mechanism looks clean and effective - possibly a little too close to PDM’s design for some people, but I’m fine with that personally.
The problem is that this reads far too much like a design document for a tool feature, rather than a standard. That’s not surprising in one way: project development scripts are a tool feature. But reading your proposal made me realise that’s exactly the problem I have with this whole discussion - we’re trying to design a feature, not develop a standard. And the feature is closely tied to tool user interface in a way that will cause problems for standardisation.
For example, a workflow tool that emphasises security would almost certainly have a big problem with allowing arbitrary shell commands as task definitions. And an environment manager that has ways of associating environment variables with a given Python environment could quite reasonably be unwilling to support .env files for tasks as well.
This highlights a problem with the whole “standardise what workflow tools do” approach for me. In order to use a workflow tool, you have to buy into all of the decisions that tool makes. You can’t like the task running capabilities but hate how it manages environments. You can’t love its emphasis on security, but be frustrated that you can’t just define a task as a shell command. People are very keen to use standards to define a uniform way to handle certain features, but that ignores the fact that there are different tools for a reason - we hear lots of talk of “one standard project management tool” but everyone seems to think that means “everyone will use the tool I like” rather than “I’m going to be forced to use a tool that I hate”.
I’d like to take a step back and understand why we even want to standardise how development scripts are defined. Looking at the “Motivation” section of your document:
- Fragmentation in the ecosystem. Well, yes, but isn’t that just saying “there are a lot of tools competing in this area”? The fragmentation here is because the workflow tools market is fragmented, which in turn is because people want different things…
- Lack of standardization. We want to standardise because we don’t have a standard? This doesn’t explain why we need a standard, though.
- Reproducibility challenges. If reproducibility matters, require that everyone uses the same workflow tool. Again, this is simply a consequence of people having different preferences for what tool they use.
- Tool migration overhead. This is often brought up, but is it honestly an issue? How often do people change what tool they use? And if they do, is the syntax for defining tasks really the big problem? Surely learning new commands and understanding the design differences that made you want to change in the first place are far more work than simply transcribing task definitions to a new format?
- Limited interoperability. This one is the key for me, as it’s explicitly about interoperability. But for this, we need to work at a lower level - this came up in the thread about test dependencies where it was pointed out that a “test” task that says “run nox -s test” is useless for interoperability because (for example) Linux distros don’t want to depend on nox.
So of these, “interoperability” is the one that matters (to me, at least). But what are the use cases? The average user doesn’t care - they will simply run pdm run test or nox -s test, or whatever the project documentation says to type to run the test task. They have to know what to do to make the pdm/nox command available, but that’s part of setting up the development environment. The proposed solution doesn’t offer environment management, so you can’t write a command that runs the lower level pytest without the user needing to set up an environment (and know what to install in it). Linux distros have expressed a need to not use the project’s preferred workflow tool, and hit exactly this problem - without knowing how to set up the test environment, knowing “how to run tests” is useless.
Conversely, it’s possible today to write a test.py script, with script metadata defining the needed dependencies, which can be run to invoke the tests:
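# /// script
# dependencies = ["pytest"]
# ///
import subprocess
import sys

# Run pytest with the interpreter that has the script's dependencies installed,
# and pass pytest's exit status back so callers can tell whether the tests passed.
sys.exit(subprocess.run([sys.executable, "-m", "pytest"]).returncode)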
Why is that not sufficient to define an interoperability standard - “Tests must be runnable by invoking a test.py script in the project root directory”? Or if you prefer, “projects MAY have a scripts directory at the top level, containing a set of Python scripts to run tasks, the available tasks can be determined by listing the directory - a task ‘test’ is defined in ‘test.py’”.
Remember - the goal here is interoperability, not “a friendly UI”.
I think we need to understand the use cases, if only to get a clear understanding of why the “named script” approach isn’t acceptable. I’m not pretending it is - it’s a strawman - but I genuinely don’t know what problems it has that the other proposed solutions don’t also have.