Proposal: overrides for installers

To mitigate the effort, it might be easier to add a feature to tomllib whereby a TOML override file can be used to override the data loaded from a TOML file. All such projects would then simply replace:

data = tomllib.load(fp)

with

data = tomllib.load(fp)
data = tomllib.update(data, override_fp)  # proposed function; does not exist in tomllib today

So, this feature would boil down to writing one update function, and then tools can opt in to using that function.

Maybe I’m wrong or misunderstanding something but isn’t

data = tomllib.load(fp)
data |= tomllib.load(override_fp)

equivalent and works already?

Cool! Is that all that’s needed for libraries to implement this proposal? Sorry I don’t have time to investigate, but if so, that would be a huge step.

Dunno about the proposal, but dicts can be updated either with the dict.update() method, using the {**a, **b} pattern, or the | and |= operators since 3.9 I think?

Sorry for the OT.

Oh, a dict update is probably not good enough. It needs to recursively update sub-elements
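
To illustrate the difference, here is a minimal sketch (not from the thread, and not part of tomllib or any other library). `deep_merge` is a hypothetical helper and the sample data is made up; it assumes that nested tables are merged key by key and that everything else, including lists, is replaced wholesale.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict with `override` merged into `base`.

    Nested dicts (TOML tables) are merged recursively; any other value,
    including a list, replaces the value in `base`.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Made-up data standing in for a parsed pyproject.toml and an override file.
pyproject = {"project": {"name": "demo", "dependencies": ["requests"]}}
override = {"project": {"dependencies": ["requests>=2.0"]}}

# Shallow merge: the whole "project" table is replaced, so "name" is lost.
shallow = pyproject | override

# Recursive merge: only the overridden key inside "project" changes.
deep = deep_merge(pyproject, override)
```

With the shallow merge the `project` table from the override replaces the original one entirely, which is why a plain dict update is probably not sufficient here.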

Well if it is enough, and you don’t have time to implement it, then I guess that demonstrates that even this small amount of work is too much to attract contributions…

Sorry if that sounds snarky, but the reality here is that none of the tools needs someone to point out a simple way of implementing the requested feature - they need someone to actually create a PR that does so. A lot of discussions around packaging tools and features fizzle out at the point where it’s been established that a bunch of people want something, and that “it looks easy to do”, but no-one ever actually tries to do it. And the reality is that it’s only when you try to implement the feature that you find out why it’s not as easy as you’d hoped.

So the maintainers get a reputation[1] of blocking good ideas, users get frustrated that tools don’t care about their use cases, and everyone gets burned out until the next idea comes along. And we make no progress.

If you (or anyone else) genuinely want to progress this issue, then I’ve described most of the points that I think are relevant in the post that contained this one comment that you quoted:

  1. If you only care about pip, I already pointed out the existing pip issue for this. Feel free to look at the history on that and propose a solution there.
  2. If you care about a standard mechanism that all tools will use, there’s also poetry and pipenv issues linked in that post. You’ll need to get consensus on all those projects for whatever solution you’re suggesting.
  3. You’ll need to look at the other practical points I mentioned. Maybe I’m being pessimistic (if your idea of simply merging 2 dictionaries is enough, then at least one of my points was overstated) but the only way to know that is to try some ideas out.

Again, my apologies if this sounds like I’m dismissing your suggestion. I suppose I sort of am, but given that this thread had been dormant since last March, I think it’s likely to need something more than this to have any chance of actually moving forward.


  1. which we don’t like having, to be clear!

Perfectly fair points.

In the spirit of moving this thread forward, I think a first step would be to make a list of all of the requested types of overrides. That way we can ensure that whatever mechanism is implemented satisfies them all.

So far, we have the ones mentioned at the top (adding a section with requirements; adding a section with an index-url), and the two I mentioned.

Adding sections seems easy. My suggestions require replacing elements, which should also be easy. There may be suggestions that require adding things to lists? Will order matter? What if someone wants to delete something from a list, or delete an entry? In that case we may need a way to specify deletions.
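
As a rough illustration of the kinds of operations listed above, here is a minimal sketch. Everything in it is hypothetical: the operation names (`set`, `append`, `remove-item`, `delete`), the path notation, and the sample data are made up for this example and are not part of any proposal or tool. Explicit deletion operations are shown because TOML has no null value, so a plain merge cannot express "remove this key".

```python
def get_parent(data, path):
    """Walk `path` (a list of keys) and return (parent table, last key).

    Missing intermediate tables are created, which is what makes
    "adding a section" work.
    """
    node = data
    for key in path[:-1]:
        node = node.setdefault(key, {})
    return node, path[-1]


def apply_operation(data, op, path, value=None):
    """Apply one hypothetical override operation to `data` in place."""
    parent, key = get_parent(data, path)
    if op == "set":            # add or replace an entry
        parent[key] = value
    elif op == "append":       # add to the end of a list (order preserved)
        parent.setdefault(key, []).append(value)
    elif op == "remove-item":  # delete matching items from a list
        parent[key] = [item for item in parent.get(key, []) if item != value]
    elif op == "delete":       # delete an entry entirely
        parent.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")
    return data


# Made-up data standing in for a parsed pyproject.toml.
data = {"project": {"dependencies": ["a", "b<2"], "urls": {"home": "x"}}}

apply_operation(data, "set", ["tool", "demo", "index-url"], "https://example.invalid/simple")
apply_operation(data, "append", ["project", "dependencies"], "c")
apply_operation(data, "remove-item", ["project", "dependencies"], "b<2")
apply_operation(data, "delete", ["project", "urls"])
```

Whether an override format should list operations like this, or use a merge with special markers for deletion, is exactly the kind of scope question that collecting the use cases would help answer.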

I think if we can narrow down the scope, we may be able to get something rolling.

Honestly, I think the best way to start would be to simply implement @sinoroc’s original specification, linked in the first post on this thread. It may not get accepted as it stands, but working code is a far more compelling argument than any sort of debate. And even if the proposal there doesn’t do what you want, implementing it would give you a much better understanding of what’s needed for any other proposal than simply talking about it.

For a somewhat easier starting point, just produce an implementation for one tool. While doing that won’t give you a good feel for how standardising a cross-tool capability differs from implementing a tool-specific feature, it’s still a great start.

I think it’s something like this. I haven’t tested it or anything so please don’t take this to be a proof of concept. It’s just an illustration of how I imagine this would work. The idea would be to bake the override process (apply_overrides in the code) right into tomllib and tomlkit, so that libraries like poetry would have little code of their own to add.

  1. I can’t comment on what the poetry maintainers would think of this, but
  2. For pip, you’ll need something that works with older Pythons, so relying on a new feature in tomllib is a non-starter for the immediate future.

Couldn’t you just vendor the single function?

Sigh. As I’ve said, prepare a PR and we’ll see. Honestly, I don’t know but I doubt it (our vendoring process vendors packages from PyPI, not bits of code from the stdlib). Copy and pasting a function definition is a possibility, but I’m not at all sure I would be happy with taking on that maintenance burden (and I can’t speak for the other pip maintainers).

Anyway, this is exactly the sort of thing I don’t think it’s productive to endlessly discuss here. Create a PR and we can see if we can resolve any questions. Otherwise, the reason this proposal hasn’t been implemented is simply “no-one has written any code”. Everything else is incidental.

I understand what you’re getting at, but from the other side, no one wants to invest weeks writing code only for the answer to be “no”. I remember when I worked for months on PEP 448’s implementation and the initial reaction of the core developers was to reject it. Thankfully, it was eventually merged, but it was looking for a while like all of that work was going to be wasted.

So I think it’s the same thing here. It helps to get an idea of whether that work will be accepted. If we had a clear positive or negative indication of whether something was likely to be accepted, it would motivate the time investment of implementing something. On the other hand, I see your point that you don’t want to answer questions without code that you can look at. Ideally, there would be some middle ground, e.g., working from a design document.

You could vendor the entire tomllib with the changes in place if that makes you more comfortable.

Your best starting point is to look at the issues linked above to see what the various projects think of this idea. And getting a consensus for a multi-project standard is even harder to assess.

What I will say (because your comment is certainly fair) is:

  1. For a proposal just implementing something in pip, I can’t even offer certainty that I will review it. My volunteer time is seriously limited, and whether I can review a PR is strongly affected by how complex it is. In this case, I think a PR will be complex, so I’m hesitant. You think it’ll be simple, so you don’t understand my reservations. I can claim that I know pip’s codebase better than you, hence my instinct is likely to be more accurate, but honestly, I have no wish to be that negative. If you can produce a simple PR[1] I’ll try to find time to review it. But if my review consists of “you haven’t thought about X, Y, Z…”, and as a result your simple idea becomes complex, I’m sorry but I’ll have to drop it at that point.
  2. If you want to write a standard that will be supported by all tools, then I’d probably be the PEP delegate. And in that context I can say that I will review and decide on it, as that’s my job. But for it to be a success, it would need to be a detailed design, with clear support from the various tools that would implement it (as well as from the community in general). Plus, at least some workable plan for how it would be implemented in one or more tools. I may comment in the discussion as an individual or a pip maintainer, but whether I do or not isn’t particularly important - I’m OK with accepting a PEP that I have personal reservations about, as long as there’s clear evidence that the community supports it and it’s of benefit to the ecosystem (which should have been captured in the PEP anyway).

(This discussion has probably taken up most of what was left of my open source time for today, so I’ll say no more for now).


  1. Remember, it’ll need docs, tests, etc - just a proof of concept isn’t enough

Fair enough, thanks for the explanation. I was just trying to move this idea forward in a small way.

I do not recall if it is mentioned explicitly, maybe indirectly, but if I understand correctly what you mean, then yes it is definitely what I have in mind with this.

No, not what I had in mind for the scope of this, but maybe it could be in scope. At least not what I would consider a priority.

That is not what this proposal is about. That would not work.

At this point, my attitude towards this idea is:

  • keep gathering possible use cases
  • hint at it here and there when I see a related discussion
  • in the hope to garner some interest and maybe find someone to champion this and get some implementation going or whatever

I encourage people to unsubscribe from this thread if they do not want the noise, which is perfectly fair. I do not think there will be any breakthrough anytime soon. If there is any breakthrough then assuredly there will be a new thread. We can also close this thread and redirect discussion to the gist instead.

Would you mind elaborating on what you need in order to implement this idea?

Not quite. As far as I can tell, this plugs into and requires changes to how the installers’ package “concretising” logic works (i.e. going from a name to a resolved distribution) as well as dependency determination logic. The recursive merge of a mapping isn’t a thing to worry about in terms of the implementation work involved IMO.

I think this would be useful.

That said, I want to caution that we avoid doing too much design work ā€œup frontā€, before (at least) considering implementation complexity in the existing tooling.

Could we? Yes, absolutely. Would we? I don’t know, depends on the function. I wouldn’t call it vendoring at that point tho. :sweat_smile:

As mentioned, most of the complexity of this lives in parts of the codebase that are fairly tightly coupled with core logic. A function to do recursive dict merging isn’t really a chunk that moves the needle much on the implementation story IMO.

If I’m reading the room correctly, @pf_moore is more concerned about how the core logic (that has a lot of nuance associated with it) would be updated, rather than the specific function for merging dicts. I share that concern: how feasible this idea is depends on the use cases, the specific proposal design and whether someone is willing to put in the effort to implement this in an existing tool.

And, @NeilGirdhar was looking to help move this forward a bit by volunteering to do so! I appreciate the interest; however, this sadly isn’t really a piece that moves forward in small pushes.

Okay, I read through the gist more carefully and I think I see why the naive combination of toml files doesn’t currently work.

My proposal would be to do the work in multiple PEPs:

  • A PEP to provide additional indexes in a pyproject.toml and connect them with dependencies (poetry already has this feature, which they iterated on multiple times; it may be worth comparing any proposal with their solution),
  • A PEP to allow (possibly editable) path dependencies to pyproject.toml (poetry has this already),
  • Various other PEPs (I don’t understand everything in the gist), and finally
  • A PEP to interpret a very simple overrides.toml that simply modifies a pyproject.toml.

This has a few benefits:

  • it breaks a large change into multiple smaller changes that can be
    • reviewed individually (greater chance of passing review),
    • implemented individually (easier on reviewer, easier to find implementers), and
    • scrutinized individually (potentially better design),
  • it motivates the poetry people to finally adopt PEP 621, which they are resisting because it doesn’t support everything they do, and
  • it’s easier to understand the pieces than a giant PEP.

Yes, this needs to be split, no doubt about it.

I am against this. Abstract dependencies belong in pyproject.toml, concrete dependencies do not. Maybe you meant overrides.toml in which case, yes, that is one of the main drivers for this proposal.

Sure, why not. Not what I have in mind, but we’ll see, maybe it comes naturally as an extension of what I have in mind.

That is just one aspect of what this proposal is about (but again, NOT in pyproject.toml! From my point of view Poetry got this wrong). For me right now it is hard to see if it is a good aspect to use as a starting point.

To be fair, I also kind of lost track of all the use cases I considered in this. I would need to find time to get back on this topic.

There were a couple of categories I thought of:

  • Make it possible to use installation modifiers that are currently global, per dependency instead. For example pip’s --pre option (which kind of triggered me reviving this thread). See also how it has been added to PDM. And the big one would be if it were possible to specify one --index-url option per dependency.

    Having all these as CLI flags would possibly become quite unreadable, so maybe a separate file used as input to the installer would be better.

  • Override faulty package metadata. Typically offer a way to override eager upper caps on version constraints.
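
To make these two categories a bit more concrete, here is a minimal sketch of what per-dependency override data might look like once a separate overrides file has been parsed. It is purely hypothetical: the key names (`allow-prereleases`, `index-url`, `requirement`), the package names, and both helper functions are invented for illustration and do not correspond to any existing tool, standard, or the gist.

```python
# Hypothetical parsed contents of a separate overrides file.
OVERRIDES = {
    "dependencies": {
        # Installation modifiers that are global today, applied per dependency.
        "foo": {"allow-prereleases": True},
        "bar": {"index-url": "https://example.invalid/simple"},
        # Metadata fix: replace a requirement that has an eager upper cap.
        "baz": {"requirement": "baz>=1.2"},
    }
}


def apply_requirement_overrides(requirements, overrides):
    """Replace declared requirement strings with overridden ones, by name.

    `requirements` is a list of (name, requirement string) pairs.
    """
    per_dependency = overrides.get("dependencies", {})
    result = []
    for name, requirement in requirements:
        replacement = per_dependency.get(name, {}).get("requirement")
        result.append((name, requirement if replacement is None else replacement))
    return result


def installer_options(name, overrides):
    """Return the per-dependency installer options for one package."""
    options = dict(overrides.get("dependencies", {}).get(name, {}))
    options.pop("requirement", None)  # not an installer option
    return options


declared = [("baz", "baz>=1.2,<2"), ("qux", "qux")]
effective = apply_requirement_overrides(declared, OVERRIDES)
```

The data handling is the easy part; as noted earlier in the thread, the real work would be in wiring such per-dependency settings into an installer's resolution logic.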
