Projects that aren't meant to generate a wheel and `pyproject.toml`

For most of that, PEP 723 and pipx run (when someone adds PEP 723 support to it)[1] will give you what you want. The only bit my idea would change is that pipx would optimise things so that it didn’t need to create as many venvs behind the scenes.
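
For anyone who hasn’t looked at it yet, the PEP 723 approach is just a small TOML block embedded in a comment header of the script itself; roughly like this (the requirements below are only placeholders):

```toml
# Lives at the top of the .py file, each line prefixed with "# ",
# between an opening "# /// script" line and a closing "# ///" line
requires-python = ">=3.11"
dependencies = [
    "requests<3",
    "rich",
]
```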


  1. Or pip-run right now, if you don’t mind a tool-specific syntax for the moment. ↩︎

1 Like

A minor comment: so far, installers/build frontends work under the assumption that if the user/tooling invokes them in a directory that has a pyproject.toml file, the user actually wants to try to install/build that directory as if it were meant to originate a “distribution” (so it goes through the process of building the wheel as described in PEP 517).

That looks like a pretty good assumption to me; after all, there is an explicit intention from the user/tooling to treat a directory as if it were meant to originate a distribution when they run pip install .

I don’t expect adding any other table to pyproject.toml to change that.

Absolutely. In the case where a build-system.build-backend key exists, the intention to build a wheel is explicit, and in principle, none of this discussion should apply (and any new tables being discussed here should be assumed to be ignored by build backends by default).

The problem is when projects rely on the default that says when there’s no build-backend, the setuptools backend[1] is assumed.
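
For reference, the explicit case is a [build-system] table along these lines (the backend and version pin shown are just one example):

```toml
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"
```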

It might be that we need to add some sort of indicator to explicitly say that a project is not intended to build a wheel. The discussion so far has assumed that the use cases are separate - there will never be a tool that has to decide between (say) [project] and [run] based on whether the project “is intended to generate a wheel” or not. Optimistically, that’s likely to be true, but I think we’re taking a risk if we don’t consider the possibility of a hybrid tool.

For example, I can imagine a linter checking pyproject.toml and warning because there’s no [project] section, even though the project is not intended to build a wheel. Or warning that there is a [project] section because it doesn’t recognise that the project is building a wheel using the setuptools legacy backend.

Maybe we simply say that the fallback to the setuptools legacy backend is deprecated. But that’s a pretty big backward compatibility break, and not a step that we should take lightly.

I don’t think this is something that would be impossible to solve, but I think it needs to be considered explicitly, because a naïve approach might result in a bad UX. Thanks for bringing it up.


  1. the legacy version ↩︎

3 Likes

I think this whole post makes sense and I find myself in this scenario pretty often (i.e. my first post in this topic). The “composable requirements” vision sounds really nice, if there’s a way to specify it cleanly. A major reason I use my current workflow is that it’s where the best tooling is[1].

One thing that I think is being skipped over (or maybe I missed it): is there a concrete proposal for what a [run] section actually does? I didn’t think there was enough detail anywhere to say that it implies one venv per project (I proposed that yesterday as a way to scope things, but I see now why it could cause issues).


  1. okay, and because I like to mess around with new features ↩︎

1 Like

I’m going off @brettcannon’s post here. It doesn’t say what the [run] section does, just what data gets stored. I agree that a proper PEP would have to address semantics as well.

As for one-venv-per-project, that’s not explicit, I inferred that from the fact that if there’s only a single [run] section with a single dependencies list, then it can only reasonably support a single venv. Yes, a dev-dependencies or optional-dependencies mechanism could add more, but there would still be a single “main set of dependencies” which is what I link to the idea of a “main project venv”.

Near the end of that post there’s this:

I think the upshot of the recent discussion is “yes, people do need that flexibility”. Or at least, this change should increase that flexibility rather than constraining it. The idea of a tool that can compose/combine venvs is really interesting.

The monorepo scenario is hairy but also useful to consider: each package within the monorepo has its own pyproject.toml to be packaged and installed[1]. They each define their own requirements, which may or may not overlap and thus be reusable across the different subprojects.

Would it make sense for the monorepo to have a top-level pyproject.toml that defines universal requirements, and then each subpackage can compose a venv with its unique requirements on top of the universal set? A tool that runs in the top-level directory looks for these files recursively, while a tool running in a subpackage only sees the requirements specified in that directory. The tool (package installer or script runner or whatever) figures out the venv(s) to use based on how it was invoked.

That isn’t precisely specified but I hope is interesting to think about, at least. One reasonable objection is “I don’t want to stick a bunch of pyproject.toml files in different directories” because that’d be messy and hard to maintain. But maybe there could be syntax that is equivalent to “use this section only in this subdirectory”.
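
Something like this, perhaps, although I want to stress that the table names and the per-directory scoping syntax below are entirely made up for illustration:

```toml
# Hypothetical top-level pyproject.toml for the monorepo
[run]
dependencies = ["numpy", "pandas"]    # universal requirements

# Invented "only applies inside this subdirectory" scoping
[run.subdir."services/api"]
dependencies = ["fastapi", "uvicorn"] # composed on top of the universal set
```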

If it wouldn’t make sense for a monorepo to have a top-level pyproject.toml, I’m not sure why a directory of basically-unrelated scripts would have one either.


  1. at least typically, I think ↩︎

For what it’s worth, at least in a bazel monorepo, a single-version policy is strongly recommended, primarily to facilitate maximum code reuse without concern for diamond dependencies and mutual compatibility.

See The One Version Rule  |  Google Open Source
https://monorepo.tools/

So typically there would only be a single pyproject.toml (doesn’t matter where in the tree, but root is common) defining all dependencies in-use across the monorepo and then specific tooling like bazel would be responsible for composing targets that form a runtime environment (PYTHONPATH environment).

So a monorepo for me would be: a single set of mutually compatible dependencies that are all correctly installable into a single environment, with the actual build/runtime targets composing direct dependencies (and their transitives) into minimal environments per target, as defined by the user and the monorepo tool (the BUILD.bazel file).

So a bazel monorepo doesn’t have any need for a “run” table. It doesn’t really have a need for a pyproject.toml either; it’s mostly there for convenience and compatibility with tooling and IDEs. All that bazel really needs is a lockfile to import the third-party dependency graph.

I think it’s probably fine to consider the monorepo scenario as a niche scenario for now, similar to GUI applications with OS installers.

1 Like

Hmmm, this is a nifty idea. When you say “per-script” here, are you specifically envisioning single-file, self-contained scripts?

The idea of specifying requirements per “thing-that-is-to-be-run” sounds very nice. But sometimes the thing that is to be run is a one-file script, and sometimes it may be something that is actually part of a library or collection or something like that. Then we have the question of how users decide at what point to split out these requirements from a PEP 723-style in-file block to some external place like pyproject.toml — and the related question of how a tool decides to look for the info when a script is run.

What you’re describing sounds to me like a concept of “profiles” — like maybe someone has a build profile and a test profile and so on. That would be a powerful concept indeed and could allow some useful flexibility.

Building on your idea of “a script HAS requirements”, maybe that can be combined with this idea of profiles. (I’m just going off the top of my head here so hopefully this makes sense.) Maybe not just a script has requirements but rather a script-profile combination has requirements (so that, e.g., “run this script in regular run mode” can have one set of requirements while “run the same script in debug mode” can have a different set). And maybe the same is true even for non-script things like building or importing a library. So it’s kind of like every “operation” or “thing you can do” (run a script to use it, run it to test it, import a library to use it, import the library to test it) has a set of requirements.

This could be realized in TOML as sub-tables, so you could have like [run] but also [run.test] or [run.debug] or whatever, each listing its necessary dependencies. I think it still makes sense to have one of those (like [run]) have special status as the default profile, so it doesn’t seem to me like this conception is totally incompatible with giving a specification for a [run] table, just that that specification could also be generalized to other profiles.
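
As a very rough sketch (the table names, keys, and whether the sub-tables inherit from the default profile are all details a real proposal would have to pin down):

```toml
[run]
dependencies = ["requests", "rich"]       # default profile

[run.test]
dependencies = ["pytest", "pytest-cov"]   # "run the tests" profile

[run.debug]
dependencies = ["icecream"]               # "run in debug mode" profile
```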

I’ve been wondering that too. Actually what I’ve been wondering is, when we talk about [run] vs [project], is that difference entirely covered by “pinning all dependencies vs. leaving as much unpinned as possible”? Because if it is, then maybe as you say we don’t need to force everything into the idea of a “run” section, but instead we just think of “each operation lists its dependencies, and those can be represented in pinned or unpinned form”. [run] corresponds to pinned and [project] to unpinned, but maybe there are also other cases that aren’t exactly “running” in the sense of a CLI tool, but still want to “freeze” the dependencies in pinned form (web apps come to mind).

It should be doable if you collect the requirements, hash them to a deterministic value, and then name the directories based on that hash.

That’s what PEP 722 and 723 were meant to address, correct? Or does your vision have some other wrinkle?

So are you saying you want to define all dependencies per-file and then somehow parse that out of every .py file to then gather the complete set? Is this why you don’t like the idea of a base set of requirements, because it is composed from a more direct listing of requirements?

If that is what you’re after then I think that’s a massive shift in goals and expected usage for this feature, and the amount of repetition per file would make me want to go back to the “use import” idea somehow to cut down on repetitive definitions of requirements, since I don’t want to write e.g. pytest twice in every test file I have (once for the import, once for the requirement).

Sure, but couldn’t you make that default set be empty if you truly didn’t want to have a single requirement be consistent across your virtual environments? When I have my test requirements I don’t want to repeat my baseline run requirements. Now if we standardized something like “. always represents the current project when listed in requirements”, that would help, as it would allow you to say, e.g., . in your test requirements. But not having a default set, which is much more verbose to specify, would make things harder in scripts.
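
Purely as an illustration, assuming “.” were standardized to mean the enclosing project:

```toml
# Hypothetical test requirement set; "." stands in for the current project
test = [".", "pytest", "pytest-cov"]
```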

One thing to consider is that the instant this exists for things like testing, people are going to ask how to replicate it for [project]. I know this is partly how we got into this whole discussion of separating [project] from [run], but we may want to have some sort of guidance/expectation for those folks.

And how far are you thinking of taking this virtual environment stuff here? Are we talking standardizing the names of virtual environments so tools do this consistently and thus can be discovered by other tooling? Or is that going too far?

That’s how I have always viewed this, but you linked to my own post on this. :grin:

No, it’s more about purpose: [project] is for wheels, [run] has been for other cases. However you choose to specify your dependencies, pinned or not, is up to you. I also don’t know if this plays into a lock file discussion as that could also be managed externally.

Okay, but what is the purpose of that distinction? This whole discussion is about situations where you don’t want to generate a wheel, which is obviously defined in contrast to ones where you do. To me “you want to generate a wheel” is just an intermediate description that doesn’t help cut through to what the relevant tasks are for the user. The question is what people want to do with the code. The process never ends with building a wheel; it ends with installing the wheel and then using it somehow.

Right now, yeah. But I think part of the issue is that sometimes people specify pinned dependencies in wheels because wheels are the only “official” way to build and distribute anything, even though they’re not well suited to some things people want to build and distribute. So the question for me is, if wheels didn’t exist, and we were thinking of what kinds of things people want to distribute and use, which things would people actually want to use wheels for, and how would they want to do that, and what would they not want to use wheels for, and how would they want to do those things?

Are there use cases where someone wants to distribute a project which is prototypically suited for wheels (namely, a Python library that they intend end users to install and use via import somelib), but they nonetheless do want to pin all the dependencies? Are there use cases where they want to distribute a project which is rather unsuited for wheels (e.g., a GUI app to be installed and used via the OS GUI), but they nonetheless don’t want to pin all the dependencies? It just seems like there’s a lot of correlation between those two dimensions (pinning and how the project will be used) and I think it’s helpful to see if that can be leveraged in thinking about what people want to do with their code.

2 Likes

The entire [project] table is oriented and documented to be a TOML representation of the metadata that goes into a PKG-INFO or METADATA file. You will have to go back to the previous thread(s) on this that kicked off this entire discussion for more exposition (i.e., it was already discussed a lot and you’re going to have to convince folks to throw out all semantics around required fields, etc. in [project] to change that).

I think you and I are viewing pinning from different angles. Pinning for me is creating a lock file from a set of dependency inputs. You seem to be focusing on folks who write pins by hand. In my scenario this is a follow-on to specifying your top-level dependencies and probably something you want specified externally in a separate lock file. This means that my perspective doesn’t view pinning as something to treat in a special way if you’re doing it manually, and if you’re generating them then they will be in a separate file.

That’s what I do in pipx at the moment. What I’d like to do is to maintain a list of requirements per venv, then when a new script is run, check if any existing venv supports its requirements, and if so, just use it. If not, try to merge its requirements into an existing venv (updating the venv’s list of requirements to match), which can then be used. Only if the requirements are incompatible with all known venvs would a new venv be created.

This approach would naturally result in fewer venvs, and in many cases just a single “works for everything” venv, which is what people naturally do before they get told that “use one venv per project” is how they should work. So it will (hopefully) make sense to users, as well as being more efficient.

I’m saying that’s how I think about a lot of my projects. Because of the way one-venv-per-project models work, I tend in practice to have to manually merge all the requirements into the specification for a single venv. And I’d much rather if my tools could do that for me, to save me the hassle of doing it by hand.

I don’t necessarily want to maintain a separate list of requirements for every script (unlike the PEP 723 case, there will be common elements for most, if not all, tasks in a project) but conversely I don’t want to be blocked from using library A in one script because another script uses library B which is incompatible with A.

We don’t have tools that do this yet, but I don’t want to get into a situation where we can’t have such tools because we’ve baked the idea of “one set of runtime requirements per project” into our standards.

That’s getting things backwards, in my view. Why not allow multiple sets of requirements on an equal footing, rather than making one set “special” and then advise people not to use it if they want multiple equal sets? If we need a “special” set, let’s just propose a distinguished name for it (like hatch does with the environment named “default”). This is somewhere I think we should be looking to prior art like hatch - if hatch feels that multiple environments are valuable, why isn’t that model appropriate here as well? The “run requirements plus maybe dev-dependencies” model seems very much based on the pipenv design, which I’d argue is hardly state of the art these days…

I don’t follow what you mean here. If people are struggling to understand that the use cases for [run] and [project] differ, then that says to me that [run] isn’t clearly-enough defined to have its own identity. And if people are clear on the differences, but still think there’s a need for test dependencies in [project] then that implies that it’s a valid suggestion, which we need to consider.

… and yet the [run] proposal singles out one set of requirements as the “project runtime requirements”, something that the requirements.txt approach explicitly doesn’t do (we talk about requirements.txt as if it’s a single file, but the feature allows arbitrarily many files, and many projects use that capability).

In the straightforward cases, one set of requirements is sufficient. So it’s easy to assume that’s good enough in general. But our standards need to cater for the messy, complicated cases as well. Otherwise, people will view new standards as a step backwards, and we’ll lose yet more credibility with people who are already struggling to keep up with changes imposed on them by tools starting to enforce new standards.

Linking back to what I said above about making sure people understand use cases, defining [run] by exception (it’s for the other cases that aren’t wheels) is a terrible way of giving people a concrete feel for what the purpose is. I think it’s crucial that any actual proposal/PEP comes up with a concrete, concise description of what sorts of project the [run] section is appropriate for. One that stands alone, and is not stated in terms of what it’s not.

That suggests that you view [run] as the input to a pinning process. Is that your primary motivation here, or is it simply one possible use of the section? Because people do manually pin at the moment - why wouldn’t they decide to do so in a [run] section rather than a requirements file? You yourself said that you view [run] as equivalent to a requirements file.

I think it’s important to remember here that PEPs define (in this context) interoperability standards. The intention is that a [run] section will be used by many tools, and therefore it’s critical to define a shared understanding of what the section means. At the moment, I don’t think we even have a shared understanding of what it means to “run a project”, much less how the data we’re proposing to store in the [run] section relates to that.

To give an example I’ve used before, tox and nox run project tasks. Those tasks have requirements, which are currently specified in a tool-specific way. This has been a problem at times - I know I’ve seen comments that tools (like dependabot, if I recall correctly) are hampered by the fact that they can’t read the requirements without special support for the tox format (and nox is worse, as environment setup is handled by procedural code). I would absolutely expect that any [run] style of proposal would need to be usable by tox/nox, as a replacement for their current custom format. And yet, that use case hasn’t been discussed at all, so far (at least as far as I can recall). Multiple requirement sets are fundamental to that case, as is the idea of requirement sets that don’t include installing the project itself. And furthermore, how we manage Python versions (python-requires) comes into this use case, giving us a practical use case to validate our design against.

1 Like

I’m not suggesting we throw out any semantics, just that if we want to handle “things that aren’t meant to generate a wheel” we need to think about why people don’t want to generate a wheel for their project. I’m skeptical that many people who have these sorts of projects are thinking “hmmm, I don’t want to use pyproject.toml or generate a wheel because I don’t want my info to go into PKG-INFO or METADATA”. Instead they’re thinking “I don’t want a build step” or “I don’t want an install step” or “I have multiple projects in the same directory” or “I can’t assume the end user has Python installed already” or something like that. So that’s the level of analysis I’m trying to start with.

Not necessarily. Ideally it wouldn’t have to be done by hand. One way I’m imagining things working is that the author specifies unpinned dependencies by hand[1] and then runs a freeze-like command that auto-populates [run] with a known-working set of pinned dependencies. My point in my earlier post was just that it seems to me that, whether it’s done by hand or not, people often want to specify pinned dependencies when they’re distributing something they expect the end user to run (like a GUI app or maybe a web app), and typically not when they’re distributing something they expect the end user to import. So there is a correlation between “are my dependencies pinned” and “what is the user going to do with this bundle”.
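
Very roughly, something like this (the [run-input] name and the idea that a tool writes the pinned list back into [run] are pure invention on my part, just to show the shape):

```toml
# Hand-written, unpinned input (the table name is invented for illustration)
[run-input]
dependencies = ["django", "gunicorn"]

# What a hypothetical freeze-style command might then write into [run]
[run]
dependencies = [
    "django==4.2.7",
    "gunicorn==21.2.0",
    "asgiref==3.7.2",   # pulled in transitively
]
```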

Totally agree, that’s why I think it’s good to have lists of use cases and scenarios (like the one you provided a few days ago) to orient our thinking. This also leaves open the possibility that different use cases may wind up requiring distinct non-wheel solutions. There may be multiple different purposes.


  1. or “implicitly by hand” a la poetry ↩︎

2 Likes

This is very much the discussion I am trying to have as well. I’d like to understand what people are trying to do, not what they are not trying to do.

My personal use case is exploratory programming. Maybe I have some data I want to analyse, or a library I want to experiment with. I’ll write a number of programs, often with not much in common beyond the basic thing I’m exploring. Some might just use the stdlib. Some might use one or two common libraries like requests or packaging. Some might be complex processes involving parallel downloads, async processing, database access. Or anything in between.

I want to be easily able to develop and run the various programs. I might end up with something that could be used standalone, and if so I’d want to be able to easily extract it from the project into its own project directory. I might drop the project for a while and come back to it later. At that time, new versions of my dependencies might exist - I typically want the latest of everything unless that breaks something, at which point I might prefer to be able to reproduce what I’d been working with previously.

I have no expectation that tools or standards could magically handle all of this for me, it’s very adhoc (and if I am being honest, a bit of a mess, usually). But I don’t want tools or standards, or even “recommended practices” that actively get in the way of this type of approach. And I fear that the move towards “a project has its own directory and a single configuration and environment” will make this sort of work a lot more frustrating, as it will be full of papercuts where you find yourself “fighting the tools”.

This is interesting, because when Brett pointed out that he sees [run] as similar to requirements.txt, I did immediately think that this implied that tools like pip-compile might view [run] as where their output should go. I didn’t follow up on that, as I think it’s clear that pip-compile should be outputting a lockfile, and [run] should be the equivalent of the requirements.in file. But the fact that you had the same thought does suggest that there’s a potential for confusion here, which is a problem.

3 Likes

Because of the desire for defaults to (potentially) simplify the common case. Imagining a py run scenario, would you expect someone to always specify the dependency group they will typically use, e.g. py run -r default ..., or be able to leave that out when they don’t need it, so they are just typing py run ...?

That’s totally fine by me. The key suggestion from me is that there is some way to say what tools should default to using when running code to avoid having to always ask the user what they want installed into their virtual environment.

That’s what I’m getting at. Right now, if you have a list of test dependencies (e.g., pytest), you either have to specify it outside of your pyproject.toml via a requirements file, hope you use a workflow tool that has built-in support for this concept so you can specify it in a [tool] table, or you use an extra like test (as specified in the core metadata as a reserved extra name). I believe that last suggestion is one some folks have not loved due to it leaking out into the UX of the wheel, hence Adding a non-metadata installer-only `dev-dependencies` table to pyproject.toml . So my point is if we come up with a way to have people specify such things in this scenario, then I suspect the [project] users will also want a solution.

A possible use case.

I can’t speak to Tox, but based on my use of Nox I don’t think that will be an issue. For instance, these days I put my test requirements in a “test” extra in my pyproject.toml and then have Nox do session.run(["pip", "install", "-e", ".[test]"]). That keeps the test requirements in a centralized place that any tooling can access, whether that’s Nox, CI, VS Code, etc.
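
i.e. something along these lines in pyproject.toml (the packages listed are just examples):

```toml
[project.optional-dependencies]
test = [
    "pytest",
    "pytest-cov",
]
```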

I’ll explain what I’m trying to do from the angle of a Django app, but it isn’t Django-specific in any way.

I want a way to write down the dependencies my Django app has. I can then zip up the code and deploy it to my cloud provider who can read those requirements and install my dependencies on deployment (this helps avoid having to ship my wheels, especially if I e.g. develop on Windows but deploy to Linux). FYI I’m ignoring running e.g. pip-compile to pin my dependencies in this discussion.

I also want a way to list what is necessary to run my test suite. This should be accessible by Nox, CI (although I personally use pipx run nox), and VS Code when it wants to create an environment for me. These test dependencies implicitly include what’s necessary to run the code.

I also want to be able to specify my linting tools (which includes formatters) so that I can run them in CI and also have folks run the same tooling locally. I may also want to be able to run them from VS Code in some way (e.g. tasks, using a specific version for the matching extension, etc.).

That’s my view as well.

Thanks. I routinely forget the “web application” use case, because I almost never write web applications. I agree that this is very much in line with the idea of “running a project” - to the extent that I’d suggest that the problem here might be that too much weight is being given to that particular use case.

But doesn’t a web app need a name and version? And a license, and an author/maintainer? We’ve focused on requirements, but just having a [run] section with requirements doesn’t consider those items. So do we end up duplicating a big chunk of [project] for the webapp case? And then we’re back to the problem that most of that data is useless for a data analysis project (for example), so what do we do for “projects that aren’t meant to generate a wheel or be run as a webapp”… :person_shrugging:

That’s all distribution metadata, so if I don’t share my web app code with anyone else then none of that is necessary. I assume you don’t add all of that to your personal scripts or any other internal code you have at work that you never ship outside of your company. This is why, for instance, a practice in the Django community is to have a requirements/ directory and then various requirements files in that directory like dev.txt, prod.txt, test.txt, etc. that represent the requirements for various situations (and they don’t necessarily inherit from each other; this is why VS Code’s “Create Environment” command asks you to select the requirements files you want instead of picking just one).

I wouldn’t assume I deploy my code to the cloud via a wheel; assume instead that I used some tool provided by my cloud host which slurped up my files and copied them to the server for me somehow, in a generic fashion, and all I had to do was tell it the command to run to launch my web server (I think Heroku did this), or that I sent a plain zip file which gets unzipped, with a config file to point out my entry points (like Azure Functions).

Correct.

Yes, regardless of whether I’m right or you’re right about that distribution metadata being important. This is why the original conversation that started this topic, if I’m remembering correctly, was asking to weaken the requirements around [project] so it wasn’t wheel-specific. But that then led to thinking that “[project] is meant for package metadata” and thus this topic was started.

You end up duplicating dependencies, optional-dependencies (although that doesn’t cover the linters-only scenario), maybe requires-python depending on whether your cloud host would read that to determine what Python version to use to run your code for you in some platform-as-a-service (PaaS) situation.

If we are going to say “[project] is for packaging metadata”, then I think the question we are asking here is “how do we specify what to install for some situation?” I think what you’re saying is there can be independent situations in the same workspace/pyproject.toml (the linting tools example), and I’m saying there can be dependent ones as well (execution and testing examples).

Maybe we are focusing on the “run” part too much and we should be thinking more about an [install] table? Or, and this might be too radical, we consider breaking out project.dependencies, [project.optional-dependencies], and project.requires-python and somehow let [project] reference those as appropriate, so that the concept of what should be installed is a separate concept altogether in pyproject.toml? That would let [project] stay packaging metadata, but also separate out the “common” things related to code in general.
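
As a strawman, with every name and the reference mechanism invented purely to show the shape of the idea:

```toml
# Shared "what gets installed" data, separate from the packaging metadata
[dependencies]
requires-python = ">=3.11"
run = ["django>=4.2", "gunicorn"]
test = ["pytest", "pytest-django"]
lint = ["ruff", "mypy"]

# [project] stays packaging metadata and just points at a shared set
[project]
name = "my-app"
version = "1.0.0"
dependencies = "run"   # hypothetical reference, not real syntax
```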

1 Like

That may well be the case.

That made me think you meant installing the project, and I can easily think of things I’d call a “project” that would never be installed, but which would have dependencies. But then I thought, do you mean “this is a list of things that need to be installed”? That doesn’t fit well with requires-python, though.

We’re still skirting round the problem that we don’t actually have a good list of use cases - everyone has their own idea of what sort of “project” they are interested in, and as a result we end up talking past each other.

I don’t think that’s too radical, no. It’s likely to be seen as frustrating churn for projects that do generate wheels, as well as for backends that consume PEP 621 metadata, so we’d need to be careful about how we handle the transition. But if we’re expanding the role of pyproject.toml this radically, we need to be prepared to look at radical solutions.

This highlights the fact that pyproject.toml was very much designed as a build system configuration file. Clearly, the flexibility of the [tool] section has resulted in it growing way beyond that original purpose, but the core sections were never designed with anything other than build system use in mind. So we’re in an awkward position now - if we want to expand the role of pyproject.toml, we either accept that we need to rethink the [project] table (I think it’s safe to assume that the [build-system] table is solely about build systems :slightly_smiling_face:), or we accept that there’s going to be some uncomfortable overlaps between “project that builds a wheel” configuration data and “other types of project” configuration.

At this point, we’re very close to the “should we be designing the ideal packaging solution for the next 10 years” debate. While I’m not a fan of letting every proposal get bogged down with long-term concerns, I do think we should be cautious when fixing one not-broad-enough design, to make sure we don’t repeat the same mistake a second time…

I don’t have a solution here. I’m mostly suggesting that we need to explore the problem space better, and resist the temptation to rush into a solution. And yes, I’m aware that might impact the provisional acceptance of PEP 723. I’m sorry about that, but I think it’s necessary.

3 Likes

One thing I don’t quite understand in this discussion: a project that isn’t meant to generate a wheel will not have a [project] table, but e.g. a [run] table instead. Why does it need to be a new table, instead of merely saying that some metadata such as name and version is only required when building a wheel?

3 Likes

Because the implications of changing the rules on what is optional, etc., particularly in such a context-sensitive way, are complex to the point where it’s almost certainly a bad idea. There was a reasonable amount of discussion earlier in the thread, which you should probably read if you haven’t already.