Projects that aren't meant to generate a wheel, and `pyproject.toml`

I don’t see why it can’t remain as the [project] section. If people are using it that way and it’s working for them, is there a problem?

As a slight spin-off of the different-section proposal, can we not satisfy the same need by relaxing the pyproject.toml file name? Those who need different dependency sets for different install modes (which I presume is the major reasoning behind needing two sections) could simply install from a different file, say requirements.toml. Every such file would use the same [project] section, and be differentiated by its file name.
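For illustration (the file name and project name here are hypothetical), such a file would look exactly like a pyproject.toml apart from what it’s called:

```toml
# requirements.toml (hypothetical name), reusing the standard [project] table
[project]
name = "my-deployment"    # illustrative
version = "1.0"
dependencies = [
    "requests",
    "gunicorn",
]
```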

1 Like

And I think this discussion is coming from wanting to see that happen.

And I view this discussion as figuring out what it would take to go from half-standardized to fully standardized, as I personally feel we have never come to a conclusion on what requirements we would have for such a thing, especially if we leave the lock file part out and view this more as an input to a tool that creates a lock file. But if people would rather work outside-in once there’s a lock file format people are happy with, that seems understandable.

From a VS Code perspective, we are still planning to start with requirements files and then adapt as appropriate as more things get standardized.

If that’s the case, I’d rather see the discussion focus on what we do about all the things in a requirements file which aren’t requirements, as they are the part most likely to cause issues.

Sorry, it appears I had forgotten most of the content of this thread. Now that I’ve re-read the discussion, I must admit that what I wrote (and that you’re replying to) was rather pointless.

Do you mean this list of options allowed in requirements.txt?

Most of those are user-specific, aren’t they? We could both be developers/maintainers on the same library but want completely different settings for all of those (typical examples: I might have a different CPU/GPU combination than you, and I might need to access different indexes than you). So I guess these options should not be written in a file that gets pushed to the source code repository.

Maybe some options, like the --constraint option, should be the same for all contributors.
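For example, the split might look like this in today’s requirements format (which extra index you need is machine-specific):

```
# machine-specific: depends on my CPU/GPU combination
--extra-index-url https://download.pytorch.org/whl/cu121

# arguably the same for all contributors
-c constraints.txt
torch
```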

I think it depends on your viewpoint of how one should control where to get dependencies. I personally think they should exist separately from the specification of dependencies, but that’s just me.

The things I think are agnostic of how/where you look for dependencies are:

1 Like

More broadly, of the things that can go in a requirements file:

  • [[--option]...]
  • <requirement specifier>
  • <archive url/path>
  • [-e] <local project path>
  • [-e] <vcs project url>

I’m talking about all of these apart from the second (i.e., “everything that isn’t a requirement”).

I have genuinely no idea which of these might be things people would want to be in a “requirements 2.0” specification.

But to put it another way: if we’re designing a replacement for requirements files, the proposal needs to support all the functionality of a requirements file, or at least very carefully consider, and provide a transition plan for, anything we choose to drop. Conversely, if we’re designing something that’s a subset of what requirements files can do, we’ve just explicitly introduced a second way of doing things, and people need to be aware of both.

It’s always a balancing act. That’s why it needs discussion, and why it will ultimately need to be made into a PEP.

I would phrase it as “requirements file use cases”, not requirements files themselves. I personally don’t see why pip would ever have to drop the file format if the team doesn’t want to (it’s already tool-specific).

… for pip users. Do other tools support requirements files right now?

OK. I have no problem with that. We standardise a solution for a particular use case. In that situation, I would argue the case that the standard solution should be handled in such a way that we can drop support from requirements files for that part of what the standard does, so that we don’t end up with “two ways of doing things”. So if requirements are going into pyproject.toml, let’s make sure that we cover all the places people write requirements.

I can’t speak for other team members, but I would happily drop the requirements file format if there were standards that covered the same functionality. And for me, the worst result would be if we had to keep something in the requirements format that was also covered in a standard, just because the standard didn’t integrate with tool-specific additional functionality. For example, you can put your requirements in pyproject.toml, but how does the user specify hashes? Do they have to ignore the new pyproject.toml feature and stick with requirements files? If you add hashes to the new spec, what about specifying per-requirement config settings?
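For concreteness, hash-checking mode in today’s requirements format looks like this (digests elided), and has no pyproject.toml equivalent:

```
requests==2.31.0 \
    --hash=sha256:... \
    --hash=sha256:...
```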

Conversely, if the new standard just allows requirements to be put in pyproject.toml, why wouldn’t pip recommend sticking with requirements files? How do we advise users when to choose one option over the other? If I’m confused, I don’t imagine a newcomer to packaging will find it obvious…

Well, things like pip-tools write requirements files. I don’t know if that’s the sort of thing you mean. Would you expect pip-tools to switch to writing to pyproject.toml? If not, aren’t we again giving mixed messages?

I’m not trying to wreck the idea, just trying to understand what people think we’re discussing. Because I don’t really know, myself, and we seem to all be talking at cross purposes.

2 Likes

Yeah, I think that’s a good phrasing.

:+1:

I think the framing around “requirements files” is tricky, since they currently serve several purposes: listing dependencies, constraining what the resolver can consider in its solution, and recording details of the concrete wheel/sdist to be installed (e.g. hashes). And I don’t know whether people view this discussion as being about recording just those top-level dependencies as the (initial) goal, or about covering all the potential inputs into the resolver. And then where does that stop and where do lock files start (e.g. are hashes of wheels and sdists a resolver constraint to control the input, or are they more a way to record what you specifically want installed, and thus more of a lock file concept)?

Yep, that’s a good example.

I would definitely hope so. I think my key point is that I personally want to avoid having to support everything in requirements files just because the support is there today. Obviously if there are reasons to have something then great, but otherwise we are just coming up with parseable requirements files (which isn’t the worst thing, but I suspect we should at least think it all through a bit first).

My proposal is we discuss what we would want to see recorded in some file to act as input into a resolver in order to calculate what dependencies will end up being installed (and that list of what’s to be installed would probably be recorded in a lock file). And I emphasized “file” in the previous sentence as I don’t know what we think should be written down next to any listed dependencies and what should be external input into the resolver (whether that’s by command-line options, installer-specific configurations, etc.).

For me, that clearly separates:

  • What we have in pyproject.toml now for producing a wheel.
  • Whatever this discussion leads to for installing dependencies for a project, which I view as writing down the inputs for a resolver (i.e., what pip-tools treats requirements.in as).
  • An eventual lock file which records the installation outcome of the previous item (i.e., what pip-tools produces as requirements.txt).

I think that covers both producers and consumers of packages.

4 Likes

Personally, I can imagine a compelling argument for the latter three, or at least an equivalent. I use Poetry, and offhand I don’t know how to tell it to get packages from somewhere that isn’t an actual package index, or if that’s even possible. I can easily enough find the documentation to tell it how to use a different index than PyPI, but it isn’t obvious if/how that can be changed on a per-package basis, by the project.

If there were a standard based around writing specific things in specific places in the pyproject.toml, then I would know, and I wouldn’t have to re-learn it if I decided that I don’t like Poetry any more and want to use something else.

However, the [[--option]...] bit seems pretty flaky. I assume these follow the command-line options of Pip. It’s unclear which options even make sense to include in this context, and their behaviour is presumably specific not only to Pip but potentially even to specific versions of Pip. That seems like a terrible basis for any changes to the pyproject.toml standard, because as far as I’m aware the point here is to maximize interoperability. Similarly, while it makes sense to have some kind of mechanism to indicate whether to use editable mode for installing local packages, it definitely shouldn’t look like a Pip-specific command line flag.

Overall, I lean towards seeing this as a replacement, and I think this requires preparing for some breaking changes for some users.

And given the existence of pyproject.toml, and of workflows already built around it that don’t end up with a wheel, I strongly lean towards making it the hub for all of these sorts of configurations, and trying to make it so that nobody needs a requirements.txt even if the Pip team decides they still want to support it.

(Lock files are another matter. I don’t see any particular reason why they need to be standardized, since the idea is to have the user’s preferred tooling generate it automatically from the combination of pyproject.toml and access to the package index. They seem meant to preserve individual users’ preferences, and cache the solving results of specific tools. If people want to pin exact versions for a specific use case, they could just… write a separate pyproject.toml with all ==-specified version numbers.)
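That could be as simple as this (names and pins illustrative):

```toml
# hypothetical "pinned" pyproject.toml used purely to reproduce an environment
[project]
name = "my-env"    # illustrative
version = "1.0"
dependencies = [
    "requests==2.31.0",
    "urllib3==2.0.7",
    "certifi==2023.7.22",
]
```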

Coincidentally: [Poetry] Is it possible to use poetry without defining a package? So purely for pinning dependencies for a venv

In case anyone else struggled a bit for it to “click” in their head why people have suggested that reusing project.dependencies for the runtime dependencies of arbitrary code is the wrong thing, compared to using it as something to write down in METADATA, I wrote up how I came to understand it in Differentiating between writing down dependencies to use packages and for packages themselves.

2 Likes

So, one purpose of the list of dependencies is just to write down some metadata, which ends up in your wheel, about what you need to make some package run; that’s it, and it’s something for build back-ends to write out to some file in a different format. The other is to list what needs to be installed for your code to run; it’s something for an installer to use as input into a resolver to figure out the complete list of dependencies your code needs. One is written down in some file as-is; the other is used as input into an algorithm that expands on the list.

I have to admit, I don’t find the argument from the blog post convincing. As far as I can tell, the reason for “writing it down in some file as-is” is exactly so that it can be “used as input into an algorithm to expand on the list” on someone else’s computer. The code has the same direct dependencies whether it’s being installed for someone else or tested locally, because it has the same import statements and consumes the same API. The client might have a different pre-existing environment, such that dependency resolution comes to a different conclusion, but it’s still fundamentally the same process from the same starting point.

In other words:

It might also mean we either need to define a new table for pyproject.toml for specifying the requirements necessary to run your code or we need a new file entirely separate from pyproject.toml for the purposes of writing down what’s necessary to run your code

I can’t see a distinction between “[metadata that sits in your project directory about] what’s necessary to run your code” and “metadata that ends up in your wheel about what you need to make some [i.e., your] package run”.

2 Likes

I’m mentally looking at this conversation as seeing if we can write down some of the inputs we give a resolver, e.g., requirements.in if you’re viewing this from a pip-compile perspective.

What sort of data are we talking about?

What are the inputs to a resolver to calculate the dependency graph (and we can use pip’s resolver as the running example)?

  • Requirements
  • Constraints
  • Places to look for the dependencies
  • Platform details (i.e., supported wheel tags, which implicitly includes Python version, interpreter, OS, and CPU architecture)
  • Stuff like requiring wheels (e.g. pip’s --only-binary)

The bare minimum that a resolver needs is:

  • Requirements

Pip’s requirements files cover:

  • Requirements (which can refer to other requirements in other files)
  • Constraints (which must be in an external file)
  • Places to look for dependencies
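A minimal illustration of that composition (file names hypothetical):

```
# dev-requirements.txt
-r requirements.txt    # pull in the base requirements from another file
-c constraints.txt     # constraints have to live in a separate file
pytest
```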

What can be inferred is:

  • Places to look for the dependencies: assume PyPI
  • Platform details: assume the running Python interpreter

When you’re sharing code with others, the constants of your code that you can reliably write down regardless of your situation are:

  • Requirements
  • Python version support

I’m assuming constraints are situation-dependent.

Strawman proposal

With all of that in mind, my strawman proposal is a [run] table that supports three keys:

  1. requires-python
  2. dependencies
  3. dev-dependencies

I think the first two are self-explanatory. The third would be like project.optional-dependencies. It might be nice to come up with a way to handle inheriting from other dev-dependencies entries, e.g. anything in just square brackets automatically includes that dev-dependencies entry. So having coverage = ["[tests]", "coverage"] would include everything in run.dev-dependencies.tests as well as "coverage". Or you could make it .[tests]. I realize this isn’t supported in any spec right now, but I think it would help with the case where people build up dependencies using -r in requirements files, and so it’s something to include in the PEP (or a separate PEP worth writing, since without a name you can’t use the self-referential hack of having extras refer to each other, e.g. in a project named spam, coverage = ["spam[tests]", "coverage"]).
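A minimal sketch of that strawman (package names illustrative; the "[tests]" inclusion syntax is the hypothetical extension just described):

```toml
[run]
requires-python = ">=3.8"
dependencies = [
    "requests",
]

[run.dev-dependencies]
tests = ["pytest"]
# "[tests]" would pull in everything from run.dev-dependencies.tests
coverage = ["[tests]", "coverage"]
```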

As written, this could go into pyproject.toml. I’m not sure if people need more flexibility in how to write out various dependency structures like you can with -r in requirements files and thus require some more sophisticated structure (i.e., a way to specify independent top-level requirements that are in no way associated with each other), or even separate files.

I’m personally okay viewing constraints and package indexes as something you pass in to the installer when calling them instead of embedding them in the [run] table.

4 Likes

A case where you’d want multiple top-level packages is running a WSGI app: you have the app itself, and then some WSGI server, e.g. mod_wsgi or gunicorn. The app and server are not related (in theory you should be able to deploy the WSGI app with any WSGI server), but you probably want to lock both versions.
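In terms of the strawman above (the app name is hypothetical), that’s two unrelated top-level requirements:

```toml
[run]
dependencies = [
    "my-wsgi-app",    # the application itself (hypothetical name)
    "gunicorn",       # the WSGI server chosen for this deployment
]
```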

Yeah, I also don’t grok this distinction. Unless the idea is that the first list is “all the dependencies, and their dependencies, and so on”. Personally I have no desire to specify that in a project file; I don’t care what my dependencies are using under the hood, and pip will install what they need.

The distinction I think you’re looking for is between the requirements for a package to run (i.e. the stuff in the [project] table, or in setup.py) and the requirements for an environment (i.e. the lock file or requirements.txt). The former are expected to compose into a list of constraints (if they don’t, it’s because there’s no overlap, either because the packages won’t work together or because someone has been overly restrictive with their requirements), whereas the latter do not (given they’re probably pinned to specific versions).

I think a reason some people are skeptical of it is that, in theory, you can have dynamic = ["dependencies"], and the build backend is in charge of filling in the dependencies and putting them in the wheel metadata. Or it could theoretically add dependencies that you haven’t specified.
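Concretely, that’s the case where a project declares something like this minimal sketch:

```toml
[project]
name = "spam"                 # the thread's running example name
version = "1.0"
dynamic = ["dependencies"]    # the build backend fills these in at build time
```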

I don’t know if there have been many use cases for this. It also sounds like it can be solved by installing the project in editable mode (like hatch run python foo.py actually does).

@dstufft’s blog post setup.py vs requirements.txt · caremad, which is linked from @brettcannon’s own blog post, makes some arguments that sound more like describing requirements.txt as a lock file; but a lock file arguably doesn’t belong in pyproject.toml (since it’s not meant to be human-edited).