Python Packaging Strategy Discussion - Part 2

It’s still conceivable to meet the needs of that group of users without having to sacrifice common patterns; compare:

We could start offering such a dialogue whenever we find ourselves in unknown territory (i.e. something or other not found), and then generate that structure / metadata as needed, possibly after user input as in the screenshot.

It just needs to be opinionated. Agreeing on such things does not come easy to the packaging ecosystem, but I’d argue that things like black show that people very much value being relieved of those choices, simply because it removes so much noise from discussions / code review etc.

Even though I often find myself disliking what black does, the fact that it’s there and it works as advertised allows energy to be focused on more productive things than code formatting (which is IMO why so many projects have adopted it). How we lay out our project folders, and with what config files and metadata we populate them would really benefit from that kind of noise reduction.

So freaking what if we decide to force (for example) a src/-layout, presence of pyproject.toml, etc., as long as we make it work seamlessly and out of the box, even for those who were used to some other variation previously. With the exception of us packaging nerds, the vast majority of people simply don’t care, they just want something straightforward, well-documented & non-confusing that works.
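
For illustration only, such an enforced layout might look something like this (a hypothetical sketch, not a settled convention):

    myproject/
    ├── pyproject.toml       # all required metadata lives here
    └── src/
        └── myproject/
            ├── __init__.py
            └── main.py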

12 Likes

Absolutely agreed. But my point is, let’s please make sure that we’re focussing on what users do, and not what we do. Currently, everything revolves around writing a (distribution) package, because that’s what the people writing the docs and tools do. But not every Python user writes distribution packages, opinionated workflow or not.

Everyone points at Rust as an example of a great opinionated workflow. And I mostly agree. But it sucks if you want to write 20+ small executables as part of a single project[1] (that’s a real-life example I had, with Advent of Code). That’s what you get when you don’t think of all the use cases.


  1. It’s possible that the way it sucks is that you can’t easily find information on how to do that. But that’s no better for a beginner than not being able to do it at all. ↩︎

2 Likes

Oh, that’s interesting. I actually found the UX really clear and intuitive with Cargo for doing that, and appreciated that I didn’t need to repeat information that was on the file system in the configuration and such, because of the well-defined structure.

I reckon that’s fair – IMO the main thing about trying to cover anything more in a meaningful manner is that we’ll need to be opinionated and focused on a consistent UX (rather than focusing on interoperability and a model of “each tool does its own thing”).

If you think it’s off-topic, I’m OK with taking it offline, but I do think it’s a good example of the pitfalls of an “opinionated” approach that doesn’t consider enough use cases, so for now I’ll keep it here.

How do you persuade cargo to allow you to have “day1”, “day2”, “day3”, … programs, all under the same project? In particular, I want the default to be no sharing of code, while still having the ability to have a “day1 parsing” library file. And on day 2, I want to just build and run the day 2 code/files. But, for example, I don’t want to maintain a separate list of dependencies for each day - the “project” is to learn rust, and the dependencies are the accumulated set of libraries that I’ve started to get familiar with.

That’s a very common workflow for me. In a similar way, in Python my “game probability experiments” project will have libraries like sympy, lea, and the like in it, but a bunch of independent scripts like “cant_stop.py”, “settlers.py”, etc. But my “PyPI analysis” project will depend on requests, packaging, etc, and scripts like “download_metadata.py”, “wheels_by_project_age.py”, etc. Neither will have a single “buildable artifact” that represents the project, nor will they have a single “run this project” or “test the project” action.

I can easily do this in Python, by handling the bits myself (a .venv in the project folder, a requirements.txt to record dependencies, and just run the scripts manually). But a workflow tool that says I need to define my project in pyproject.toml, and give it a name and version, and put my scripts in a src directory, and my tests in tests, just leaves me endlessly fighting the assumptions of the tool. I’ve tried various tools and hit that sort of problem - I’m deliberately not naming tools, though, because my point is that it’s a general “mindset” problem.
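
For concreteness, the manual version is only a handful of commands (a POSIX shell sketch; the package and script names come from the PyPI-analysis example above):

    python -m venv .venv               # per-project environment
    . .venv/bin/activate
    pip install requests packaging     # the accumulated dependencies
    pip freeze > requirements.txt      # record them for later
    python download_metadata.py        # just run scripts directly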

If cargo can do this, then maybe there’s some lessons for Python’s workflow tools in there.

6 Likes

I’m picking up particularly on the phrase “which the PyPUG is concerned with” in this.

People don’t think the PyPUG is only concerned with distribution packaging. Or if they do, then they still go there for other reasons, because there’s nothing else available. And we really don’t give them a great experience. At the very least, if that’s what we want to do, we should be explicit about our scope, and up front with the people we don’t intend to help. But at the moment, we start off looking like we want to help everyone, but rapidly lose anyone who has a project that doesn’t fit our model.

The “Overview” section starts with “Thinking about deployment”. That’s great. But if I look at the suggested questions, and try to answer them for at least one of my projects, what I get is

  • I’m the only user of my project - maybe I’ll send a script to one of my friends, but that will be rare and something I’m happy to handle if & when it comes up.
  • The software is intended to run on my PC, today. I’d like it if I could still run it in 6 months, but if I have to modify things because software versions have changed, I’m OK with that. The data the code generates is more important than the code itself.
  • My software is never installed. It’s run directly from my development directory.

But it is a project for me. It has a lifespan, it gets maintained and delivers results. It might even feed into larger pieces of work (typically in the form of data that informs a different piece of development). And as such, I need advice (and maybe even workflow tools) to help me manage that project.

The PyPUG now says “With this information, the following overview will guide you to the packaging technologies best suited to your project”. But I read the whole of the remainder of that overview section, and it gave me nothing - not even a suggestion as to what to read next. The two big sections are “Packaging Python libraries and tools” and “Packaging Python applications”. Even if I squint hard and view my code as an “application”, the application section is aimed at something that’s so different from what I’m doing that it’s basically unrecognisable to me.

Maybe we don’t have the resources or experience to write documentation for the sort of project I routinely do. That’s fine. We’re volunteers and we work on what interests/motivates us. I know I wouldn’t know how to document “best practices” myself - I’d very much be a consumer rather than a producer of such information. But I think we’re deluding ourselves if we think that what we currently have serves the needs of the full range of Python developers. Personally, I think we target a minority of users, although we’ve made it look like more than it really is by forcing the “distribution package” model to handle more than it naturally does. For example, console script wrappers as a means of making “distributing an application” fit the “distributing a library” model - it’s clever, and extremely effective for many use cases, but even so, it’s making the problem fit the solution rather than the other way around.

2 Likes

The location of all binary entry points is specified as src/bin/[name].rs, and you can run it via cargo run --bin [name]. src/main.rs is the “default” entrypoint, and what you get if you did cargo run with no binary names provided (if it exists).

Also, cargo build will build everything by default.
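
To make that concrete, a sketch of the layout for the Advent of Code case above (the day1/day2 names are taken from that example; the shared library file is optional):

    aoc/
    ├── Cargo.toml           # one shared dependency list for every binary
    └── src/
        ├── lib.rs           # optional shared code, e.g. day1 parsing helpers
        └── bin/
            ├── day1.rs      # cargo run --bin day1
            └── day2.rs      # cargo run --bin day2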

3 Likes

Sigh. Or maybe I’m just dumb and missed it :slight_smile: Although “being able to do something isn’t much use if people can’t find it” is a lesson, I guess…

1 Like

Thanks @pf_moore: you’ve articulated much of what I had in mind when I started this sub-thread. Perhaps we can define a couple of hopefully-common use cases to go along with “libraries and tools” and “applications”:

  • stand-alone scripts
    and
  • personal libraries of functions

I think a key point that Paul alluded to is that in these cases, the details of the dependencies aren’t that critical – maybe it uses, e.g. pandas, but it’s OK if two years from now when I try to run it again, something odd happens, and I need to update the code a bit for the latest pandas.

I think the “personal libraries of functions” use case is reasonably well covered by a simple package – a really minimal setup.cfg file and an editable install works great. We just need to document that better.
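
As a sketch of how minimal that can be (the package name here is made up, and a reasonably recent pip/setuptools is assumed for the editable install):

    # setup.cfg
    [metadata]
    name = my_personal_lib
    version = 0.1

    [options]
    packages = find:

and then an editable install with pip install -e . from the project root.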

Simple one-file scripts, not so much – in fact, even a couple of scripts in with a personal library are a bit tricky. I wish you could just install a script directly, rather than defining an entry point. I’m pretty sure that setuptools used to allow that (with a wrapper exe on Windows), but apparently no more. So if you want to have a script installed, it has to be written with a main function that can be called, and then a somewhat cryptic declaration as an entry point. So I think we could have a better story there.
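
For reference, the declaration in question looks roughly like this in setup.cfg (names hypothetical; my_lib/cli.py must define a callable main()):

    [options.entry_points]
    console_scripts =
        my-script = my_lib.cli:main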

As far as being opinionated about project structure – I agree that that’s OK, even for these simple use cases, as long as:

  • it’s easy to auto-generate
    and
  • there is very little required metadata – you should be able to have a package name and that’s about it.

For me this was the single biggest stumbling block. It took me weeks to understand which route I should be using, and then weeks more to implement it well enough to work. I was trying to ship an application, yet it was pure Python[0], so it could be packaged either way - but neither route really let people easily install and use it.

To this day, every time I set about adding packaging to a new project, I set aside several days to see what the current state of the art is, because I know (better read as: hope) it will be different from the last time. And even if it hasn’t changed, I’ll have to almost learn it fresh anyway, because the path is not clear.

[0] @smm, since you asked folks to say which projects, in my case it was Leo-Editor, though I’ve since relinquished the role.

Hmm,

I would have thought you’d want to ship Leo as a stand-alone desktop app – e.g. with PyInstaller.

But I see now that the Leo website recommends installing Leo with pip or from source. In that case, maybe the problem was making too much of a distinction between an “application” and a “library” – AFAICT, the only difference is what kind of entry points it has (if any).

And, looking quickly at Leo’s current docs, there doesn’t seem to be an entry point installed anyway.

Which maybe points to a missing feature in the docs!

Matt can speak up if he likes, but I’d assume that they recommend installing through pip because “that’s how you install Python things”.

We have a strong culture of everything (apps, libraries, dependencies) going on PyPI, and basically no culture of standalone apps.

Yes, and this is one of the most glaring flaws in the current ecosystem. It would be cool if fixing this can be an outcome of this discussion.

I think conda has some potential in this regard, and the Anaconda distribution even includes a graphical installer that allows you to install “apps” (like Spyder). But it still can be confusing for a user who doesn’t understand when they do and don’t need to install an app multiple times in different environments, or how to use an app that’s installed in one environment to work with another environment. Again the problem is how to unify all this into a coherent and painless experience for the user.

I’m not sure who the “we” is in this case – Py2exe, Py2app, PyInstaller (and freeze before that) have been around for a long time. But if “we” is the PyPA community, then absolutely, because application distribution and “packaging” are not the same thing :slight_smile:

However, the term “application” is pretty tricky – for desktop GUI apps (and maybe desktop command-line apps), fully wrapped-up stand-alone apps are a good way to go – but for, say, web applications, not so much. PyPA-style packaging works fine for those (though most don’t need to go on PyPI!)

Perhaps the issue with that is that there isn’t that much desktop development going on these days.

1 Like

“We” is the community in general, and I’m trying to reflect the perception of users (rather than those of us within the PyPA), because that’s how it gets reflected to me.

I get asked frequently about this stuff, and the packaging apps are great and usually the response, but they aren’t known in the same way that pip and PyPI are known. So people rarely discover them on their own and instead try to force their way onto PyPI.

In general, I’m not ever referring to “the PyPA community” in these discussions, because we know exactly who’s in that community of project maintainers, and they’re not the target audience for the work we’re doing :wink:

I agree with this, and I’d go further and say that because those types of tool aren’t well known and easily discoverable, they feel “niche”, and have (in my experience) rough edges that reflect the fact that people using them need to have a certain level of expertise to even find them. That’s a vicious circle, and it would be extremely beneficial (for the user community as a whole) if we could somehow break it.

Covering “building a simple command line utility as a standalone application” in the packaging user guide would be a great start. And no, I don’t mean “make your application a distribution package and give it an entry point” :wink:

1 Like

Fair enough – it’s been a LONG time since I was new to all this :slight_smile:

I think a big part of the problem is that not very many people are developing stand-alone applications these days – it’s just not on folks’ radar.

Looking now, the page on Applications in the PyPA docs is incomplete – that’s not helping :frowning:

https://packaging.python.org/en/latest/discussions/deploying-python-applications/?highlight=Application

And there is no entry in the Glossary for “Application”.

I suppose I should go spend time contributing to those docs instead of kibitzing on Discourse …

I disagree. People write CLI tools all the time - they just bundle them via entry points, which is good enough to stop them wanting something better.

2 Likes

Sorry – I meant stand-alone desktop (GUI) apps.

Even for command line apps, if they are intended for an audience that doesn’t know or care that they are written in Python, then the pip install with entry points is less than ideal.

Want to use my app?

  • First, install Python (make sure it’s one of these versions …)
  • Second, decide if you are going to be using Python for other things; if so, it’s probably best to set up a virtual environment – then you’ll need to activate that environment to run this app.
  • pip install it

Now you’re good to go!

Not a great story.

Conda and conda constructor are a slightly better story – particularly with conda execute (though it doesn’t appear to be getting any maintenance :slight_smile: ) – but not much better.

1 Like

For Python CLIs, pipx essentially replaces the last two steps, and you never have to worry about venvs for this use case.
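
In other words, something like this (package name hypothetical):

    pipx install my-app    # pipx creates and manages an isolated venv for the app
    my-app --help          # the entry point is now on PATH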