PEP 621: round 3

I believe this answers my question — if the other backend authors who participated in the process don’t find PEP 621 useful, then it should be dead in the water.

Yes, that is the benefit I am talking about — if we standardize where metadata goes in an sdist, setuptools can do it quietly, which is why I was excited about standardizing sdist metadata. We should definitely have a conversation about the best way to standardize sdist metadata and come up with an approach that will work well so that we can realize this benefit.

I think I’m relatively convinced that we should withdraw PEP 621. IMO it will still be useful because it’s a pretty good design for a static metadata spec, and people can adopt it whether or not it is accepted.

I find it very useful at least :slightly_frowning_face: . I’m sure setuptools and flit would adopt it shortly too.

As-is or under [tool.<NAME>]?

It’s the backend authors who have to adapt, but other consumers (particularly PyPI) who get the benefit.

As a (part-time) backend author, I’d prefer the sdist standard to be based on the wheel metadata than the user-written metadata. That way I don’t need to deal with two different output formats (but I also deliberately designed my backend to treat sdist as just a partially in-place compiled source directory, including rewriting the pyproject.toml completely, so there’s not a lot that happens in the sdist->wheel step).

But provided I’m using a library to read/validate/write the pyproject.toml file, it’s no big deal. I’d rather not have to encode all of the transform logic between PEP 621 and METADATA though.

What is it you find useful about it, though? I was always dubious about the prospect that this is solving a major problem. Because this doesn’t specify how a package is built, it’s not like this makes your pyproject.toml file interoperable between different backends.

If you just adopt PEP 621 under tool.hatch or tool.hatch:project (to answer your question) you get basically all the benefits. We can even write a library that parses these things directly into some sort of intermediate object that is capable of writing METADATA files (you tell it what the root table is and it just works).
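
To make that concrete, here is a minimal sketch of such a library, assuming the TOML has already been parsed into nested dicts (for example by `tomllib.loads` on Python 3.11+). The names `get_table` and `to_metadata` are hypothetical, it only handles statically specified `name`, `version` and `dependencies`, and the `Metadata-Version` value is an assumption:

```python
from typing import Any, Mapping


def get_table(data: Mapping[str, Any], root: str) -> Mapping[str, Any]:
    """Follow a dotted table path such as "project" or "tool.hatch"."""
    table: Any = data
    for key in root.split("."):
        table = table[key]  # raises KeyError if the table is missing
    return table


def to_metadata(data: Mapping[str, Any], root: str = "project") -> str:
    """Render a few statically specified fields as core metadata text."""
    table = get_table(data, root)
    lines = [
        "Metadata-Version: 2.1",  # assumed value for this sketch
        f"Name: {table['name']}",
        f"Version: {table['version']}",
    ]
    # Dependencies are assumed to be PEP 508 strings, as Requires-Dist expects.
    lines.extend(f"Requires-Dist: {req}" for req in table.get("dependencies", []))
    return "\n".join(lines) + "\n"


# Hypothetical parsed pyproject.toml with the metadata under [tool.hatch].
parsed = {
    "tool": {
        "hatch": {"name": "demo", "version": "1.0", "dependencies": ["requests>=2"]}
    }
}
print(to_metadata(parsed, root="tool.hatch"))
```

The only thing that changes between `[project]` and a tool-specific table is the `root` argument, which is the point being made above.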

If we assume that, because it only covers metadata, the benefits for documentation, project templating and backend switching are marginal (fair), it seems that the only thing backends would be getting out of this would be the ability to put their metadata in the [project] table rather than a tool-specific table, which is not such a big deal.

There is one place where I could see the idea of standardizing metadata being useful, though, which is for tools that seek to scrape metadata directly from repositories rather than source distributions. E.g. dependabot or libraries.io or whatever. Even in a world where source distributions are standardized, that would allow tools like that to avoid unnecessary sdist builds when the project they are analyzing uses PEP 621. Without PEP 621, such tools would either need to always execute builds or just special-case tool.setuptools (and maybe add a parser for poetry and maybe flit), and not be compatible with more marginal backends.

I don’t know that we ever got any input from someone building tools like this, and I don’t know how much of an important use case they are.

I mostly like:

  1. the familiarity of fields granted by the future network effect of widespread use, allowing for a mostly interoperable pyproject.toml file. similar to [requests|httpx].[get|post|...]
  2. quite simply, I like using standards. it reduces uncertainty and avoids having to re-invent the wheel, whether that be naming, implementation, etc.
  3. as you mention, a default place to look for dependency scanners

ahem :slightly_smiling_face:

I build adhoc tools like this a lot. But if you want the really big potential consumer of this data for me, pip is probably the one to look at. What pip most needs from a sdist is name and version (which we get from the filename) and dependencies. I imagine a bunch of dependency data will be deferred to wheel build time, but we could get a lot of benefit from reliable dependency metadata in sdists.

When resolving an install request, pip downloads distributions from the package index to get dependency information. That’s costly and it forms a major performance hit for pip’s new resolver. For wheels, all we have to do is download and extract the metadata. For sdists, we have to download, unpack, set up an isolated build environment, and call the PEP 517 metadata hook. If we had reliable PEP 621 metadata in the sdist, we could check for non-dynamic dependency data, and if it’s there, bypass all of that. (And to be clear, the build cost isn’t something that’s going to get paid anyway, it’s quite possible we will discard a sdist because it doesn’t lead to a valid result).
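
As a rough sketch of that check, assuming the sdist’s `pyproject.toml` has already been parsed into nested dicts: the function name `static_dependencies` is hypothetical and this is not pip’s actual code. It is deliberately conservative and treats a missing `dependencies` key the same as a dynamic one, which is an assumption of the sketch rather than something taken from the PEP:

```python
from typing import Any, List, Mapping, Optional


def static_dependencies(pyproject: Mapping[str, Any]) -> Optional[List[str]]:
    """Return the dependency list if it can be trusted without a build.

    Returns None when the caller has to fall back to the expensive path
    (isolated build environment plus the PEP 517 metadata hook).
    """
    project = pyproject.get("project")
    if project is None:
        return None  # no [project] table at all
    if "dependencies" in project.get("dynamic", []):
        return None  # backend will calculate them later
    if "dependencies" not in project:
        return None  # conservative assumption: absent means "unknown"
    return list(project["dependencies"])


# Hypothetical parsed files: one static, one dynamic.
print(static_dependencies({"project": {"name": "a", "dependencies": ["x>=1"]}}))
print(static_dependencies({"project": {"name": "a", "dynamic": ["dependencies"]}}))
```

A `None` result just means "do what we do today", so the shortcut only ever removes work.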

So I’m very much arguing that consumers are important from real-world requirements. And while I’d just as happily take a “standard sdist metadata” PEP, I’m frankly sick of getting bogged down in debate on that. The advantages of PEP 621 here are:

  1. It has a mechanism for saying “to be calculated later”.
  2. It defaults to static data - dynamic is explicitly opt-in.
  3. There’s no debate over the name of the flipping file, it’s pyproject.toml.

But if backends know the data, and don’t put it anywhere because we’re waiting for the mythical “standards sdist format”, I can’t use it in pip or anywhere else unless the user migrates their project to PEP 621. And waiting for users to adopt the new standard is probably just as slow as waiting for sdist standardisation.

So yes, in its new form, PEP 621 offers a significant benefit for pip. Probably other consumers as well, but I can’t speak for them. There’s a cost for backends in that they have to update pyproject.toml, but honestly I don’t think that’s a big chore.

I apologise if I assumed incorrectly that people realised the above. I’m pretty certain pip’s use case was presented during the initial PEP 621 discussions, but got dismissed for the same “we’re not standardising sdists” reasons. That’s one of the reasons I became less interested in PEP 621 - it was clearly only being targeted at being a common input format for backends. I was never involved as a backend developer, but as a consumer developer (IIRC, @pradyunsg and @uranusjr were in that category too¹). Fair enough, but we’ve now essentially failed to make it a common input format, at least to the extent that Poetry have said they are unlikely to adopt it for some time. So I was left wondering what’s left. When I realised that we could possibly bring back the benefits for pip, I suggested that to @brettcannon and he was willing to give that a go.

But without being a cross-backend format, and without being usable for tools that want to introspect source distributions, I’m not sure there’s enough left in PEP 621 to warrant standardising (as opposed to just the backends that want to have a common format getting together and agreeing one).

¹ And wasn’t @di involved for the Warehouse side? Surely Warehouse would be another case that would benefit from being able to read pyproject.toml for metadata known to be fixed across all distribution files for a project?

I don’t have much of a say on this, but I think we should be pragmatic here and just accept the current version to avoid further stagnation. It’s frankly a huge net win for backends and consumers alike.

Also, after a few years of gradual adoption by tools + projects, we’ll actually unlock the ability to accurately resolve dependencies for arbitrary platforms/environments!

That is not what I’m talking about. Consumers like pip need sdists with reliable metadata. I was talking about tools that parse repositories rather than any result of a build process.

There is no question that standardized sdist metadata would be useful for any number of use cases.

You can’t avoid the debate on it just because you repurposed a PEP that had broad agreement for a totally different purpose.

There’s a huge amount of debate over the name of the file, because pyproject.toml is a human-written file that changes the semantics of builds! It’s also a second core metadata spec when we already have a core metadata spec!

Slower. Which is why I said we should focus on sdist standardization if that’s what we want. It’s what I said when PEP 621 discussions started and you said, “it will take too long”, and now the result is that we designed something totally unsuitable as a standard store of sdist metadata because we never set out to design a store of sdist metadata.

Again, to be clear, no one is questioning the benefits of having a standardized store of metadata in sdists. I have already assumed that PEP 621 is not that, and was asking about people reading pyproject.toml in repositories. They would plausibly still get some value out of this even without the new pyproject.toml-rewriting scheme.

The current version where backends need to re-write pyproject.toml? Strong -1 on that. If that happens, please take my name off the PEP. Obviously I wouldn’t be implementing this myself, so I can’t say whether or not it would be acceptable to other setuptools maintainers.

I also find the requirement for backends to update a user-provided, user-facing file (pyproject.toml) awkward.

And I’m not convinced it would actually be that useful:
the most obvious field that comes to mind as being affected by this “requirement” is version (with a setuptools-scm backend), but in the case of an sdist the version would also be present in the filename, so the benefit for pip is marginal.
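
For illustration, pulling name and version out of an sdist filename can be sketched like this. The helper `split_sdist_filename` is hypothetical, and it assumes the `{name}-{version}.tar.gz` (or `.zip`) naming convention with no hyphen in the version, which is a convention rather than anything guaranteed in this discussion:

```python
from typing import Tuple


def split_sdist_filename(filename: str) -> Tuple[str, str]:
    """Split "name-version.tar.gz" into (name, version)."""
    for ext in (".tar.gz", ".zip"):
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError(f"not a recognised sdist filename: {filename!r}")
    # Project names may contain hyphens, so split on the last one only.
    name, sep, version = stem.rpartition("-")
    if not sep or not name or not version:
        raise ValueError(f"cannot find a version in {filename!r}")
    return name, version


print(split_sdist_filename("setuptools-scm-5.0.1.tar.gz"))
```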

Outside of this updating part, I’m in favour of making the provided information canonical, mainly as a way to inspect dependencies and/or license information of projects.
Plus I like the idea of standardising this “boring” part instead of every backend painting the shed slightly differently.

OK, before I make a decision to reject my own PEP, I want to summarize what I’m hearing to see if there’s any agreement somewhere among people.

From pip/@pf_moore, the benefit of PEP 621 is the static sdist metadata. But then setuptools/@pganssle is saying they don’t want to do it this way, period.

From setuptools/@pganssle, there’s benefit in getting more people to write static metadata as a back-end input source. But then pip/@pf_moore doesn’t really care since they only come into the situation at sdists or later. That’s not an outright “don’t you dare do it”, it’s just inconsequential to pip and so Paul just isn’t interested at that point.

Everyone seems to agree there is benefit to source checkout analysis. There also seems to be general agreement that a standardized way to specify how to write all of this stuff out for users is beneficial.

So here’s my blunt, to-the-point question to @pganssle: if I remove the sdist idea, what does PEP 621 need to make it good enough for setuptools to adopt it? If the answer is there isn’t anything that could make it acceptable then the PEP is dead. If there is something (i.e. all the project info fields like keywords are required or something), then we see if flit and pipenv are on board as well and if one of them signs up I think that’s enough to keep the PEP (I’m also assuming Steve will get his library support :wink:). We might need Paul to still be the delegate on behalf of back-end developers and not pip, or we ask someone else to rule from the back-end perspective.

But I think this is it. Either I get setuptools buy-in or I’m rejecting PEP 621 and moving on.

I don’t think @pganssle here was saying setuptools is not interested in adopting PEP 621… just that he himself is not interested in doing so. It would be a major endeavor though to migrate from setup.py and setup.cfg, and given the dynamic nature of setup.py I’m not even sure how much of it could be automated :man_shrugging: That being said, setuptools doesn’t have too many maintainers nowadays, so I feel if pip cares about making the change they should probably contribute (at least the initial) PR.

I wasn’t expecting pipenv to get mentioned in the discussion, since pipenv is not currently involved in producing distributable Python packages, and PEP 621 as it currently stands is designed specifically for defining metadata for a distribution. So to make sure there’s no misunderstanding here: in what form would pipenv be expected to adopt the format, if we want to?

I’m fairly certain that was just a typo and was intended to say poetry.

I don’t want to unilaterally act on setuptools’ behalf (particularly since, especially lately, @jaraco has done something between the lion’s share and all of the work), but to the extent that my opinion is what sways things, I plan to advocate for implementing the configuration file spec from PEP 621 regardless of whether or not it’s accepted (modulo table name). I think it’s a Good Enough Design™ (especially considering that it was designed by committee…).

The only reason I’m suddenly down on this PEP is that the latest iteration changed it into an sdist standardization PEP.

If we remove the sdist standardization stuff, I can see some marginal benefits of PEP 621 being accepted, but I can also see Paul’s point that standardizing the input is not that important, since the interop benefits are kinda weak, and without the interop benefits, it doesn’t need to be a standard, since anyone interested can just adopt it or not.

It wasn’t a typo, it was me misremembering that pipenv doesn’t produce any binary artifacts.

Thanks, and I understand you can’t necessarily unilaterally speak for setuptools (unless @jaraco also chimes in :wink:).

@takluyver what do you think of the PEP?

I would have thought one of the biggest reasons for standardizing the specification of a project’s metadata is the benefit to users, both project authors and library users. If in 5 years’ time 98% of all Python projects had the same way of specifying author/dependencies/entry-points/etc, then less Python-literate visitors to repos would have a better time analysing a prospective package to include in their application. Was this ever the case with setup.py?

98% (made up number :-)) of projects already have a standardised way of specifying metadata: it’s setuptools. (Apologies to flit and Poetry if that 2% I left you is too small :slightly_smiling_face:).

What’s useful about PEP 621 is that it’s an introspectable standardised way.

My argument is that introspection is more often done on sdists than on original source trees, so having PEP 621 data be reliable only in source trees is relatively unimportant. @pganssle’s argument is that making PEP 621 data be as complete as possible in sdists subsumes sdist metadata standardisation.

If a file is considered reliable in source trees, why wouldn’t this same un-updated file also be reliable in an sdist?

To avoid derailing PEP 621, I’ve made a proposal here for taking the idea of dynamic and adding it to the core metadata for use in sdists.

If we can agree on that, I’m fine with abandoning the idea of backends writing to pyproject.toml. (If we can’t agree on that proposal, I may have to abandon backends writing to pyproject.toml anyway, but there’s less risk of PEP 621 being caught in the fallout if we can deal with sdist metadata separately).
