Sdist idea: require `pyproject.toml` and PEP 518/517

I think this idea is pretty self-explanatory: be able to build wheels from an sdist using PEP 518 and 517. This would give sdists a consistent interface for building a wheel from them.

4 Likes

Doesn’t this basically just mean removing the fallback assumption of setuptools when no other backend is specified?
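
For concreteness, the fallback in question looks roughly like this. This is a minimal sketch, not pip’s actual code; the defaults come from PEPs 517/518, though the exact pins vary by frontend:

```python
import os
import tomllib  # Python 3.11+; the "tomli" backport offers the same API

# Per PEP 518, a missing build-system table means "assume setuptools";
# per PEP 517, a missing backend maps to setuptools' legacy backend.
LEGACY_DEFAULT = {
    "requires": ["setuptools>=40.8.0", "wheel"],
    "build-backend": "setuptools.build_meta:__legacy__",
}

def resolve_build_system(project_dir):
    path = os.path.join(project_dir, "pyproject.toml")
    if not os.path.exists(path):
        # Legacy project: silently assume setuptools.
        return LEGACY_DEFAULT
    with open(path, "rb") as f:
        data = tomllib.load(f)
    # A pyproject.toml without a build-system table falls back too.
    return data.get("build-system", LEGACY_DEFAULT)
```

The proposal would effectively turn the first branch into an error for anything that wants to call itself a modern sdist.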

I believe the implication is that non-PEP 517 setuptools projects would no longer be called “sdists” but something like “legacy sdists” or “source archives”. PEP 517 setuptools projects would not change, since they should already contain a pyproject.toml.

1 Like

That’s consistent with what I said, but if we’re going to rename “sdist with pyproject.toml” and “sdist” to “sdist” and “legacy sdist”, what actually changes?

Unless the implication is that pip doesn’t support “legacy sdists”? In which case, that’s the same as saying “we no longer assume setuptools when it’s missing pyproject.toml”, yeah?

I think pip will support legacy sdists, at least for a (long) while. The difference this makes is that, combined with other ideas, pip would be able to infer useful information from a modern sdist without actually downloading, unpacking, and building the archive. pip currently needs to download, unpack, and run prepare_metadata_for_build_wheel() to get dependencies; the happy path could be drastically improved by downloading only {name}-{version}.sdist.pyproject.toml from the index (similar to pypa/warehouse#8254) and inspecting the static dependencies field in it (see the sketch at the end of this post). To achieve this, we need to:

  1. Have a way to mark a file as a modern sdist in the file name, so an index knows when to expose the metadata file (PEP 625)
  2. Guarantee what the modern sdist contains (this proposal), and how to get it (the sdist archive format, as discussed in the PEP 625 thread; pyproject.toml would be in the archive root)
  3. Decide how and where an sdist can declare static metadata (the “Sdist idea: specifying static metadata that can be trusted” thread, and whether we go with pyproject.toml)

pip would need to implement fallbacks at every step for legacy sdists, but alternative package managers could choose not to support some or all of the legacy stuff. This would provide one piece of the puzzle for the whole process.
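
A rough sketch of what that happy path could look like, assuming the {name}-{version}.sdist.pyproject.toml naming above and a hypothetical index URL layout (none of this is standardized yet):

```python
# Hypothetical sketch of the "happy path": fetch only the exposed
# pyproject.toml for an sdist and read its static dependencies, without
# downloading, unpacking, or building the archive. The URL layout and
# the [project] dependencies field are assumptions, not settled specs.
import tomllib
import urllib.request

def sdist_dependencies(index_url, name, version):
    # Assumed layout mirroring {name}-{version}.sdist.pyproject.toml
    # (cf. pypa/warehouse#8254 for the wheel METADATA equivalent).
    url = f"{index_url}/{name}/{name}-{version}.sdist.pyproject.toml"
    with urllib.request.urlopen(url) as resp:
        data = tomllib.loads(resp.read().decode("utf-8"))
    # Only trustworthy once step 3 guarantees this field is static.
    return data.get("project", {}).get("dependencies", [])
```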

1 Like

I hate to be the wet blanket on all these ideas, because I do like the idea of progress, but I think this is another situation where we need to take into account that this will impose real adoption costs and won’t buy us much.

If we make “you include a pyproject.toml with a build-system table” table stakes for building modern sdists, we’ll just have a ton of projects that don’t generate modern sdists for a long time to come. For a lot of things like PEP 625 and standardized metadata generation, we can opt almost everyone in to building new-style sdists overnight because essentially all the work will be done by the backends. Anything that requires action by packagers will drag out the adoption of the standard.

If we think that end users will want new-style sdists so much that they’ll adopt PEP 517/518 just to get them, I guess this is a pretty minor thing to do, but I think standardizing sdist metadata is important enough that we shouldn’t try for a “blank slate” design; we should try to deviate as little as possible from things that can be achieved with little to no action from end users. Whatever benefits pip would gain from knowing that modern sdists use PEP 517/518 would likely be completely swamped by even an extra 10% or 20% of sdists having reliable dependency metadata.

4 Likes

I think we also agree, though, that long term we want everyone to adopt PEP 517/518. So that makes me ask two questions.

One, what carrot are we going to provide to drive this if we are not going to tie it into everything we do? We can hope for organic adoption, but we all know that updating one’s packaging code is the last thing people modernize. So without explicitly driving more people towards it, how are we going to motivate this? Having pip some day say it will simply drop implicit setuptools support?

And two, how are we going to update curds in the future for when tools are ready to stop assuming setuptools when PEP 517/518 are not followed? Once again, is this going to be a per-tool thing, so that e.g. pip just decides when it wants to require pyproject.toml for curds?

If we want to go with a tools-driven transition rather than a spec-driven one as a general rule, then that’s fine, but I would personally want the pip team to buy into that plan as how we are going to drive people to modernize their packaging story long-term.

I agree that we should switch to PEP 517/518 builds as the only supported builds, but for the purposes of adoption I don’t necessarily agree that it makes much difference whether the build-system table is actually specified or not.

If we want to completely deprecate projects that don’t have a pyproject.toml, we shouldn’t do it as part of building a new sdist; we should do it through a normal deprecation rollout. If we really care about this, then probably the right thing to do is for pip to do all builds as PEP 517 builds, defaulting to the current defaults, and then have setuptools deprecate and eventually remove the build_meta:__legacy__ backend.
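
For illustration, the setuptools side of that rollout could be as simple as a warning in the legacy backend before removal. This is purely hypothetical; nothing here is an announced setuptools plan:

```python
# Hypothetical sketch of deprecating setuptools' legacy backend: wrap
# each PEP 517 hook so the __legacy__ path warns before eventual removal.
import warnings
from setuptools import build_meta as _build_meta

def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    warnings.warn(
        "setuptools.build_meta:__legacy__ is deprecated; declare a "
        "[build-system] table in pyproject.toml instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return _build_meta.build_wheel(
        wheel_directory, config_settings, metadata_directory
    )
```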

As for what “carrot” we can provide — I don’t think that “you can use the new source distribution format” is much of a carrot anyway. Even more so than with pyproject.toml, most of the benefits of using the new source distribution accrue not to the original project but to the ecosystem as a whole, and to tools that will be allowed to make simplifications.

Yes, I think that the specification of the source distribution and the specification of what it means to be a buildable Python project do not need to be connected in this way, so we don’t need any sort of update. Either the source distribution contains a project that pip and other build tools can work with — meaning one with a pyproject.toml if the build tools require that — or it doesn’t.

Yes, I think this needs to be tools-driven. I think people won’t see “we can generate new-style source distributions” as any kind of benefit, and in fact will see it as a cost (wait, another new format? How is this different from eggs? Do we need to migrate away from wheels now?). I think a tools-driven approach is easier to sell: “We’ve been telling you about PEP 517/518 for years, and we’re doing a long, hopefully responsible roll-out, but we simply cannot continue supporting the old ways indefinitely.”

I’m struggling to follow what’s being proposed here. What plan, exactly, do people want the pip team to buy into? Personally, I don’t have much confidence that pip can drive adoption of new standards - we have too much backward compatibility pressure to be the first project that drops support for legacy code.

That the way forward in encouraging people to adopt PEP 518 & 517 is via pip eventually saying, “we don’t support any other way to build a wheel from an sdist/curd/whatever”.

In which case, as a pip maintainer, I’d rather that other pressures/benefits were what encouraged users to adopt PEP 517/518, and pip dropped support for legacy approaches when a sufficient number of projects had converted. That’s how pip normally moves forward.

Other pip maintainers may differ - but I personally don’t have the energy for that sort of fight.

My original hope was that “being able to use other build tools like flit” would be the carrot to drive adoption. But I don’t see many signs of that happening yet…

Well, I think there are two prerequisites to this:

  1. Improving the quality of pip’s implementation of PEP 517/518.
  2. Ensuring we have a reasonable replacement flow for all the other existing install flows that pip supports (setup.py install, setup.py develop, etc.).

This means, in terms of standardization today, our “blocker” is the editable installs standard. Until that is finalized, pip can’t switch to PEP 517/518 by default.
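
For context, a hypothetical sketch of the shape such a standard might take: one more backend hook alongside PEP 517’s build_wheel. The name and signature here are assumptions, since no editable-install hook had been standardized at this point:

```python
# Hypothetical: an editable-install hook a backend could expose,
# parallel to PEP 517's build_wheel. Name and signature are assumed;
# no such hook exists in any accepted spec yet.
def build_editable(wheel_directory, config_settings=None, metadata_directory=None):
    """Build a wheel whose installation exposes the project in editable
    mode (e.g. via .pth files or import hooks); return its basename."""
    raise NotImplementedError("pending a standardized editable-install spec")
```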

I’d say the adoption of PEP 517/518 is primarily hampered by three things: the sheer performance regression when adopting PEP 517 with pip, the complications with compiled packages and pip’s isolation logic today, and the lack of editable install support.

Once these three are solved, I’m on board with running pip’s regular deprecation process for all the non-PEP 517 installation flows and straight-up deleting the code that supports them (for good).

3 Likes

Instinctively, I don’t think we’d want to do this. I don’t think the __legacy__ backend is in any way related to the problem at hand. And I don’t see why we should want to drop support for basically all PyPI packages from before 2017 (or whatever the relevant date is).

The issue for PEP 517 adoption was not the lack of a carrot, but that our only available implementation of PEP 517 is painfully slow compared to the status quo, creates new issues for anyone who’s using a-bit-more-than-Python, and in general has not evolved despite everything we’ve learnt over the course of the initial adoption.

At this point, I don’t think spending our churn budget on pushing for more projects to switch to PEP 517/pyproject.toml is a good idea - not until we spend resources improving the underlying issues that are preventing users from adopting it (or causing workflow issues to those that have).

1 Like

Agreed. Somewhat related - one thing I think we (the “packaging community”) could do with improving is our feedback loops for informing standards updates based on user experience. Currently it feels like we don’t have a really good way of collecting such experience from tool implementers - we mostly just rely on the fact that the folks building the tools are generally involved in standardisation efforts, and hope that’s good enough.

2 Likes

PEP 517 is currently Provisional. If I understand the term’s definition correctly, PyPA is currently gathering experience from its core packaging tools. Anything that uses the PEP must be prepared for breaking changes. I’ve been labeling such tools as “beta” or “provisional” or otherwise dangerous. That’s possibly scaring people away.

I think we’ve got to the point where we can make PEPs 517/518 final. @pf_moore? It’s unlikely we’ll drop anything from them at this point, and we can make changes in follow-up PRs, no?

I think that doing a quick “any problems before we make PEP 517 final” survey would be worthwhile, but yes, basically. I couldn’t find a specific process for moving a PEP from provisional to final, so unless anyone feels otherwise, that’s what I’d suggest. I would like someone to formally propose that we move the status to final and manage that review, though. In theory I suspect it should be the PEP authors (@takluyver and/or @njs) but I’m OK with anyone else doing so, as long as the authors don’t object.

This is definitely working to the extent that people are using other tools, and distro packagers are starting to adopt at least some form of PEP 517 as a result. For the people using setuptools, they don’t have much incentive to affirmatively adopt a pyproject.toml file, and they are hopelessly confused as to why they need to or would want to.

The only carrots we could offer those people are things like, “Build isolation is good” and “you’ll get the latest version of setuptools”, but that could easily be achieved by flipping the switch to defaulting to PEP 517/518 on every build, regardless of whether or not pyproject.toml is present, and we have more incentive to flip that switch than end users have to opt in to PEP 517/518.

What do you mean by this? Editable installs don’t build wheels and whether or not they work is orthogonal to PEP 517/518. Many of my projects are setuptools projects with a setup.py and all of them have a pyproject.toml. The ones with a setup.py work just fine with pip install -e and with pip install. I don’t see any real reason for blocking the switch to PEP 517 by default based on the fact that pip install -e can’t use it.

It’s related to the problem at hand in that if we switch to PEP 517 by default and drop support for the __legacy__ backend, the pyproject.toml with a build-system table will become required by default.

Personally, I don’t care if people explicitly specify their build system in build-system, as long as it’s unambiguous what it means. My point is that if we care about that, we should do it by saying, “This won’t work anymore” instead of “You won’t be able to use this new thing that we care about but if you are the kind of person who doesn’t want to add a pyproject.toml you also probably don’t care about this.”

I think it’s fine for our newer features to build upon one another and start requiring previously added features, using the sheer weight of all the stuff a project misses out on by not modernizing as the carrot. To put it another way, there is no single feature that will work as a carrot for everyone, but if we build our features on top of each other, the chance that a particular project cares about one of the features gated behind the “new one true way to do X” increases with each new feature. In addition, each new feature that we make independently usable turns defining what exactly a Python package can look like into a combinatorial problem, where a package could exist with every combination of features turned on or off. That makes it very difficult to have a comprehensive test suite, and more likely that there are unforeseen edge cases arising from unexpected combinations of legacy and new behaviors.

On the flip side of that, sometimes the ecosystem wide benefits, or the benefits to us as a standards body or tool authors are great enough that it makes sense to both pay the cost of additional complexity and give up a potential additional carrot (or forcing function if you like) in order to make the adoption costs of this specific feature much lower (or even be able to opt people in by default).

It’s a balancing act, and neither answer is correct in all scenarios, but I think it’s a reasonable default position to say that new features/standards should build on and expect the use of other new features/standards where it makes sense to do so, but be open to finding ways to make some features independent where that is more helpful.

1 Like

This is a great point, and one that I agree with (I also agree with the “it’s a tough balancing act” part). To me, though, it only makes sense to apply this when one thing naturally builds on another. In this case, I don’t think we actually get very much out of it, assuming we follow the pre-existing course where pip will eventually default to using PEP 517 even in the absence of a pyproject.toml (since the main reason to require pyproject.toml seems to be to ensure the project can build using PEP 517).

On the other hand, I think that we’ll gain an enormous advantage by having at least the zero-point-oh version of standardized sdists be as close as possible to “standardizing the status quo” (minus any super heavy legacy baggage).