Developing a single tool for building/developing projects

I maintain distlib, but there’s no feature development going on currently. Some of the PEPs it was based on became moribund (as ever in packaging, there are many opinions and it’s hard to get consensus), and the window of time I had to contribute has mostly closed. I don’t think its scope is particularly big - the mirrored metadata is mainly there to investigate/support better dependency resolution, which works reasonably well, as far as it goes, but wider adoption would depend on e.g. pip wanting to use it.

1 Like

For folks wanting to see the scope of a true “all in one” tool, the one we reference at the end of https://packaging.python.org/tutorials/managing-dependencies/ is hatch, since that covers project templating, release management, etc.

The main problem with “all in one” tool proposals is that they can only have one default behaviour. That’s OK if you’re able to get in early in a relatively new ecosystem (as folks that don’t like the defaults will self-select out of the ecosystem entirely), but it mostly doesn’t work for getting adoption in an established ecosystem.

pip was only able to pull it off because it was trying to replace a specific existing tool (easy_install) that had defaults that were a long way from what most people wanted, and the switching costs for that usage model had been deliberately kept low.

By contrast, most of the tools we have now are explicitly making different core design assumptions. For example:

  • hatch: provides answers for everything Ofek wanted an answer to when designing a project management process
  • pipenv: dependency management that doesn’t assume the project itself is a Python component, but does mostly assume one target environment per repo
  • poetry: dependency management that does assume the project itself is a Python component
  • pip-tools: a lower level DIY dependency management toolkit that doesn’t make any more workflow or target environment assumptions than pip itself does

If we exclude the shortcomings of PEP 517 regarding editable installs, since it’s a well-known subject and a tricky one, I think the biggest limitation I found was PEP 508 (https://www.python.org/dev/peps/pep-0508). From the start, I wanted Poetry to provide “or” or “union” constraints for dependency specification, for instance requests==2.20.* || requests==2.22.*. This is not possible in the current ecosystem, so it’s discouraged, since it would lead to metadata incompatible with the existing tooling. Where it can be used without too much of an issue is when declaring the Python compatibility of a project. Basically, you can declare python = "~2.7 || ^3.4" and Poetry will format it to an understandable constraint: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*.
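
To make that expansion concrete, here is a minimal sketch, assuming the third-party packaging library (which pip vendors), checking which interpreter versions the expanded PEP 440 constraint admits:

```python
# Minimal sketch, assuming the third-party `packaging` library is installed:
# check which versions satisfy the PEP 440 constraint that, per the post,
# Poetry emits for `python = "~2.7 || ^3.4"`.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

spec = SpecifierSet(">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*")

for candidate in ("2.6", "2.7.16", "3.0.1", "3.3.7", "3.4.0", "3.8.0"):
    print(candidate, Version(candidate) in spec)
# 2.6 False, 2.7.16 True, 3.0.1 False, 3.3.7 False, 3.4.0 True, 3.8.0 True
```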

@ncoghlan

We had this discussion before, and I mentioned back then that while the scaffolding part of Poetry (the new command) and the packaging part (the build command) assume a somewhat standard package structure (even though it can be adapted to more complex structures via the packages property in the pyproject.toml file), the rest of the commands do not: you can manage the dependencies of any Python project, be it a library or an application, with Poetry.

I am really curious to know why you keep thinking that Poetry is not suitable for applications, since that’s something I have mentioned from the start.

2 Likes

Do you want to start a separate topic to discuss this and see if sentiments have changed?

I don’t remember anyone ever being hostile to “or” dependencies but I do remember not knowing how to implement them.

Poetry assumes the thing being built will itself be installable with pip (whether that’s an application or library), whereas pipenv mostly tries not to make assumptions about why you want a virtual environment with Python dependencies in it (where it does make such assumptions, they’re in the vein of “you’re deploying a Python web application directly from source control”).

Both tools can be adjusted to counter that default assumption when it isn’t what you want, but there’s an ongoing level of slightly increased friction when doing that, so folks may instead choose to adopt the tool that makes their use case the default.

Thus my bringing the distinction up as an example of the challenges faced in trying to create “one tool to rule them all”: even when the categories under consideration are as closely related as “publishers of pip-installable components” and “users of pip-installable components” (with publishers mostly being a subset of users), the choice of which usage model to centre still results in visibly different UX decisions. (And that’s without even getting into the “Are binary extensions the exception or the rule?” split between the PyPA tooling and the language-independent conda and Linux packaging tools.)

Making tools infinitely configurable doesn’t solve that kind of problem, as all those options can in turn lead to bewildering complexity in the user experience (cf. distutils/setuptools).

At the moment, packaging.python.org is also squarely in the category of centering users of pip-installable packages, with the step up to publishing your own packages in the third tutorial introducing a relatively low-tech way of doing things, rather than the full complexity of an all-in-one solution like hatch.

So the question of “How can we smooth the transition from ‘consumer of Python packages’ to ‘publisher of Python packages’?” is an entirely sensible one - that transition is definitely far bumpier than we would like.

The aspect I’m challenging is the assumption that a new tool is the right way to attempt to tackle that, versus improving the guidance we offer for linking users up with an existing tool that fits their use case.

For example, could we perhaps create a decision tree like choosealicense.com for “Choose a Python workflow”, where the terminating nodes were options like:

  • hatch: all-in-one Python project publishing CLI
  • poetry: for publishing Python components and applications that can be installed with pip (and similar Python component installers)
  • pipenv: for publishing Python applications and working on Python components
  • pip-tools: for building DIY Python publishing workflows
  • conda: for using, building and publishing Python components with complex binary dependencies
  • briefcase: for publishing Python applications that can be installed using native target platform installers

Hey all, glad to see this conversation is happening. I don’t have a lot of time to spend on this right now (many of you will have noticed I’ve been absent for a while; this isn’t the thread for that discussion, but I should be back around now), but I will say that I tend to agree that users absolutely expect one tool and are probably incredibly confused by all of the ambiguities and subtle distinctions we constantly talk about.

I have advocated for this quite a bit, and I think @pradyunsg has taken up this mantle as well since I initially brought it up, but I strongly feel that what needs to happen is user research, just like how PyPI was vastly improved via UI/UX help and by researching the things users actually struggle with.

I tend to think it’s very difficult to make assumptions about what things are like for users at this kind of scale without professional help to design something that will actually gather actionable input, which can then be used for iteration. They can also help figure out answers to questions like “should it be in pip, or would users want a new tool, or many tools?”, or whether we are all completely missing the point; I don’t know.

On the other hand, the result of such a study may be that everything should be in one tool. That’s quite a lot to maintain, and if nobody is being paid to do it, it’s not really actionable advice IMO. What I will say is that we should probably put this on the agenda for PyCon, or consider having a separate event to discuss this, and see if we can get folks like @sdispater, @frostming and @uranusjr to attend as well. I’m not sure how much of this has been considered, as I honestly haven’t had time to catch up on the last 4 months of discourse notifications, so forgive me if I am restating things that have already been talked to death.

If you look at the HN comments on Jacob’s packaging blog post, there is a general theme, plus one sub-theme that is probably a bit more tractable.

Based on the fact that I’m commenting here, it’s probably easy to guess that the big theme was people wanting a single tool. :slight_smile: No real shock, although some people did pick up on the fact that if you don’t like the tool’s opinions you’re stuck, while Python’s variety gives you a bit more freedom.

But the more interesting thing was the underlying theme of people being confused about specifying dependencies. Now, I know we have tried for a long time to get the point across about the differences between packages and apps, but you won’t know that if the first thing you read on Python packaging didn’t point it out (and chances are it didn’t).

So for me, that thread was really pushing for something like Poetry’s use of pyproject.toml to specify dependencies, so there’s a universal way to do that regardless of tool, and then a unified lock file for apps, as discussed in Structured, Exchangeable lock file format (requirements.txt 2.0?).
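
Purely as a hypothetical sketch of what that might look like (the [project] table name and keys below are illustrative assumptions, not an agreed standard; parsing uses the stdlib tomllib from Python 3.11+):

```python
# Hypothetical sketch: a tool-agnostic dependency table in pyproject.toml,
# parsed with the stdlib tomllib (Python 3.11+). The [project] table and its
# keys are illustrative assumptions, not an agreed standard.
import tomllib

PYPROJECT = """
[project]
name = "example-app"
dependencies = [
    "requests>=2.20,<3",
    "click>=7.0; python_version >= '3.6'",
]
"""

data = tomllib.loads(PYPROJECT)
for dep in data["project"]["dependencies"]:
    print(dep)  # plain PEP 508 strings that any tool could consume
```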

So maybe working towards standardizing how we specify dependencies could help us move towards more coherence between the various tools, so people could switch between them more easily? Or is this something like version specification, where someone is going to tell me there’s a magical way of doing it that some people really like and that I’m not aware of? :wink:

I know this kind of goes against what I say everywhere I talk about pipenv and how to use it, but this distinction is only important because it means we have to work around the functionality available to us in Python. For new users, writing some Python code and making an application is framed as just copying some Python files around (an oversimplification), but when you are ready to package and distribute that code, you might have to put it in a src/ directory, add a MANIFEST.in, a pyproject.toml, a setup.cfg or a setup.py, specify some stuff (sometimes in multiple places), and basically fundamentally change the structure of what you had before.

use pyproject.toml to specify dependencies so there’s a universal way to do that regardless of tool, and then a unified lock file

So npm and yarn both generate and consume their own lockfiles, because they resolve dependencies differently; I assume it would be the same here. That’s part of why I’ve typically pushed back against putting resolved dependencies in pyproject.toml.

So maybe working towards standardizing how we specify dependencies

I, at least, am 100% on board with finding a way to standardize all application metadata that the user inputs, including dependencies, package name, version, etc. I somewhat doubt whether it’s realistic to reach consensus, given the number of tools consuming this metadata. I’m guessing that if this is important for the future of the language, it will take a PEP and will need to address compatibility concerns from setuptools etc.

This is a pretty significant UX issue and is another piece of why I recommend(ed) hiring a UX consultant to help understand the assumptions we make about user behavior and how to take a more user-focused approach to this. Ultimately that’s the reason I care about any of these topics so I’ll be on board with anything that is reasonable and will make things better for users.

We have a separate topic to discuss a lock file format, so I don’t want to dive into that specific topic here.

We have already failed to reach consensus on specifying the version, as some like Flit’s approach of __version__ and some want setuptools_scm and pulling from the VCS. I am specifically avoiding that problem of trying to standardize everything – although I too would love to standardize the details that are the same across tools due to wheel needs, etc. – and trying to focus on dependency specification, to see if people have the stomach to tackle this specific thing first/next.

That would probably require a grant request to PSF and the Packaging WG, but I don’t think anyone would object to paying someone to look into this.

1 Like

To clarify what you mean when you say dependency specification: are you looking to change the available formats (Pipfile, setup.py, setup.cfg, pyproject.toml, requirements.txt, etc.) and agree on a specific place to put this information, or on a specific format for storing it once it’s there, or a way to move between formats, or adjustments to the dependency specifiers from PEP 508?

Yes.

Not quite sure what you’re suggesting by this comment.

No.

Sorry for the unclear remark; you have answered my questions, and I think from my POV we are in the same position we were in the last time we had this conversation – we will adapt to accommodate whatever format we can all agree on, but I am guessing there are numerous concerns about compatibility etc. I remain personally in favor of standardizing.

It shouldn’t be too difficult to manage. The build tools will control this themselves, or by changing their PEP 517 entry point. For tools like pip, pipenv, and poetry, I would assume they would just recognize the new input option; e.g. pip -r could recognize a TOML file, or pip could grow another flag for pyproject.toml in this instance.
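
A rough sketch of the kind of shim that implies (not a real pip feature; the [project] dependencies key and tomllib are the same illustrative assumptions as above):

```python
# Not a real pip feature: a sketch of a shim that reads a dependency list
# out of pyproject.toml and hands the PEP 508 strings to pip on the command
# line. The [project] dependencies key is an illustrative assumption.
import subprocess
import sys
import tomllib

with open("pyproject.toml", "rb") as f:
    data = tomllib.load(f)

deps = data["project"]["dependencies"]
subprocess.run([sys.executable, "-m", "pip", "install", *deps], check=True)
```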

This is the part I was thinking might be a challenge, since tools like setuptools would likely want to avoid breaking backward compatibility, but something like this would demand that they also use pyproject.toml as a possible configuration source, or at least a potential dependency source. I’m sure I can’t think of all the details, but at a minimum it probably means specifying which information source wins when they conflict. While this isn’t impossible, it’s one of the points made above about current challenges in setuptools.

Since that falls a bit further from what I can speak to in terms of challenges, I’ll leave that to others and just say that I agree we should be aiming at defining manifests in a single, standard way (I know this is more than what you are aiming for here), and ideally in a way that doesn’t require executing code to find out what a project depends on, as far as possible.

I want to start collecting existing dependency specification formats and try to come up with a workable format for pyproject.toml by comparing them. Is there existing work on this? If not, what are the ones I should be comparing? Currently I can think of:

  • Flit, setup.py/cfg, and requirements.txt (these use PEP 508 strings directly; see the parsing sketch after this list)
  • Pipfile
  • Poetry

And some from other ecosystems we can take inspirations from:

  • Cargo (I’m especially intrigued by how it specifies platform-specific dependencies)
  • Bundler (specifies markers and extras in a reverse form from PEP 508, like discussed in this post)
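
For comparison, a single PEP 508 string already bundles the name, extras, version specifiers, and environment markers; this sketch, assuming the third-party packaging library, pulls those pieces apart, which is roughly the set of fields any structured TOML form would need to represent:

```python
# Sketch, assuming the third-party `packaging` library: decompose a PEP 508
# string into the fields a structured (e.g. TOML) format would represent.
from packaging.requirements import Requirement

req = Requirement('requests[security]>=2.20,<3; python_version < "3.8"')
print(req.name)       # requests
print(req.extras)     # {'security'}
print(req.specifier)  # version constraints: <3,>=2.20
print(req.marker)     # environment marker: python_version < "3.8"
```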

Not that I’m aware of, but I had a similar thought last night about us collecting how people currently do it.

  • enscons
  • mesonpep517
  • setuptools_scm

You should probably take a look at conda recipes.

1 Like

This looks interesting, thanks for the advice!

I’ve created a repo and included a write-up on dependency specification. I have fewer ideas about other metadata fields (e.g. the license field, version declaration), so any PR is very much welcome!

3 Likes