Modernising my packages - am I thinking about this all wrong?

I realized this didn’t come across as I intended—it wasn’t meant to be a recommendation against shipping sdists on PyPI, but rather that ideally users consume wheels, while still providing sdists. In fact, several of my later points rely on the latter when I’m explaining why the project configuration format package authors use is still important, because it is still completely relevant when users install from sdists. In particular,

But I definitely realize how that other line could be interpreted to imply that you shouldn’t ship sdists, which was not what I meant. I’ll update it.

Coming from a package author/upstream perspective, I don’t have a problem with doing so, but what I’ve had a hard time understanding is why downstreams don’t just use the source tarballs, as they are the definitive source form of the project, whereas the sdist is nominally for user consumption. I can see why that is the case for special cases like @pf_moore where very restrictive corporate policies are in place, but not for Linux distro downstreams or other open source projects I have more inclination to spend my volunteer time supporting.

1 Like

(post deleted by author)

By C.A.M. Gerlach via Discussions on Python.org at 24Mar2022 02:05:

[… detailed response …]

I just wanted to post to thank everyone here for the near immediate and
helpful replies. I’m running down the Tutorial, which seems more
explanatory than I’d thought, to clarify what to update in my
processes.

Thank you all,
Cameron Simpson cs@cskk.id.au

2 Likes

Thanks for the clarification @CAM-Gerlach !

1 Like

By Cameron Simpson via Discussions on Python.org at 24Mar2022 22:01:

I just wanted to post to thank everyone here for the near immediate and
helpful replies. I’m running down the Tutorial, which seems more
explanatory than I’d thought, to clarify what to update in my
processes.

Well, I spent a big chunk of the weekend on updating my release script.
The tutorial was, while providing a bit more context than I’d expected,
still a tool specific recipe for a simple setup.

I still feel that PyPA lacks a “package release overview” document which
outlines the steps involved and their purpose, in order to provide
context for the specifics mentioned in places like the tutorial and a
mental framework on which to hang individual documents like the PEPs.

So I’ve written what I would like to have available, ideally listed just
above the reference to the tutorial:

https://github.com/cameron-simpson/css/blob/pypi/doc/pypa-the-missing-outline.md

Are you folks open to adding this, suitably revised for correctness?

The objective is the flow: what to do, and why.

Cheers,
Cameron Simpson cs@cskk.id.au

1 Like

Some content suggestions:

  • Mention that you need setuptools >= 61.0 for pyproject.toml support (maybe a version for each of the backends?)
  • Include references to further reading for package data-files, extension modules, testing, CI
  • Rename “Upload Artifacts” to “Build Artifacts” (or similar): the former to me sounds like an action, not a thing
  • The default for build is to build an sdist, then use that to build a wheel. To me that sounds more resilient (passing --wheel builds the wheel directly from source, not testing the sdist)
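To make the first suggestion concrete, here is a hypothetical minimal pyproject.toml for a setuptools-backed project; the project name, version, and metadata values are illustrative placeholders, not recommendations:

```toml
# Hypothetical minimal pyproject.toml; "example-package" and all field
# values are placeholders. setuptools 61.0 is the first release with
# [project] table (PEP 621) support.
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "example-package"
version = "0.1.0"
description = "A short one-line summary"
requires-python = ">=3.8"
```

With a file like this in place, `python -m build` in the project directory builds an sdist and then a wheel from that sdist (per the last suggestion above), while `python -m build --wheel` builds the wheel directly from the source tree.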

How does this improve the packaging story over Packaging Python Projects — Python Packaging User Guide? That seems to provide all the steps you provide, while being more in-depth and including more steps. Is it that you think it’s too detailed? Is it because it has mainly setuptools-specific configuration and examples (I agree: now that setuptools supports pyproject.toml, it should be updated for generic backends)?

By Laurie O via Discussions on Python.org at 29Mar2022 05:49:

Some content suggestions:

  • Mention that you need setuptools >= 61.0 for pyproject.toml support (maybe a version for each of the backends?)

Bumped. I don’t know enough about other tools to recommend versions;
suggestions welcome (with terse reasons, maybe).

  • Include references to further reading for package data-files, extension modules, testing, CI

I’ll see what I can dig up. This is not meant to be filled with detail -
being swamped with detail in a bazillion separate documents was what led
me to the conclusion that there’s no concise overview, which this is
meant to address.

I came to this wanting to update my ancient setuptools
setup(lots-of-arguments) incantation to modern approaches. And found
myself bogged down in PEPs which were both specific and vague and all
the other documents. I spent hours in that swamp.

My problem was that there was no big picture: what is required, with
what pieces. And links to the specs for the pieces.

  • Rename “Upload Artifacts” to “Build Artifacts” (or similar): the former to me sounds like an action, not a thing

Ok.

  • The default for build is to build an sdist, then use that to build a wheel. To me that sounds more resilient (passing --wheel builds the wheel directly from source, not testing the sdist)

It is; I build that way myself. I wanted to talk about the source and
built distributions separately though, so the incantation is specific to
the topic. I figure someone setting this up will at the least see the
help text for “build”.

How does this improve the packaging story over Packaging Python Projects — Python Packaging User Guide? That seems to provide all the steps you provide, while being more in-depth and including more steps. Is it that you think it’s too detailed? Is it because it has mainly setuptools-specific configuration and examples (I agree: now that setuptools supports pyproject.toml, it should be updated for generic backends)?

I had a run at the tutorial again after the original post on this topic,
and found it wanting (for me) as before. It does talk a bit about what
it is doing.

It’s ok to bootstrap a single package with a specific tool (setuptools)
in a specific opinionated layout. Which is great for the new packager
with no opinions who just wants their package out the door and on PyPI.

However, I’ve got opinions and my repo is not much like the example. To
make my setup use the modern approach I need to understand what all
the bits are for and where they sit. And that overview is not apparent
to me at the PyPA site.

So this document is supposed to:

  • be short - a layout of the flow to distribute something, describing
    the pieces and their relationships
  • not be a tutorial; there’s a nice tutorial already
  • have some examples, but not be prescriptive except in the sense of
    prescribing “you need to make some distribution files”

So:

  • it has a point list of the objectives, starting at the author and
    arriving at the end user
  • it goes over each of those points in a little detail after the main
    list, to make them clear
  • it has some references to the places where relevant things are
    specified (PEPs, tools)

My remark above about the PEPs being both specific and vague? 517 and
518 were the ones that particularly gave me that impression. They are
written for people who already understand the larger picture and the
existing ecosystem. I can go to them to find specific information, but
they taught me little as a basis of “what do I need to do?”

This may sound like a litany of complaint, but my core issue is lack of
a concise overview. With an overview I know what needs doing, and what
things do. What the various bits of pyproject.toml are for.

'Soup: This is the one that Kawasaki sent out pictures, that looks so beautiful.
Yanagawa: Yes, everybody says it’s beautiful - but many problems!
'Soup: But you are not part of the design team, you’re just a test rider.
Yanagawa: Yes. I just complain.

Cheers,
Cameron Simpson cs@cskk.id.au

2 Likes

I like this! I think it’s a good overview, avoids getting into details (which as you say, is what you want if you’re just looking for a feel for “how everything works”) and covers the main points well. I’d love to see this as part of the packaging user guide. I’m not directly involved in maintaining that document myself, so it’s for others to ultimately approve it, but it definitely has my +1.

There’s a few terminology points where your wording seems slightly unusual to me (as someone used to the packaging ecosystem)[1]. But nothing significant, and if this does get incorporated into the PUG, terminology can be tweaked as needed.


  1. Nothing specific, just a general feeling that the document wasn’t written by a “native speaker of packaging” :wink: ↩︎

4 Likes

Overall, it looks very helpful. I’ve noticed a handful of specific points, but haven’t gotten around to writing it up yet, sorry.

I also noticed some specific terminology points of confusion, specifically around being clear when you mean project vs import package vs. distribution package, which are all quite different things, as well as a few smaller nits, e.g. using “build backends” and “fields” vs “keys” vs. “options”, but that’s not too hard for us to help clean up. The PyPA glossary as well as the PEP 639 Terminology section might be of help here.

By C.A.M. Gerlach via Discussions on Python.org at 29Mar2022 21:20:

I also noticed some specific terminology points of confusion,
specifically around being clear when you mean project vs import
package vs. distribution package, which are all quite different
things, as well as a few smaller nits, e.g. using “build backends” and
“fields” vs “keys” vs. “options”, but that’s not too hard for us to
help clean up. The PyPA glossary as well as the PEP 639 Terminology
section might be of help here.

I’d welcome cleanup of these nits. Bearing in mind that the objective is
clarity of the overview; while I definitely do not want wrong
terminology in there, I do want limited verbiage.

Very happy for every technical term’s first use to be a hyperlink to
the glossary and/or specification document.

And as a personal style point, I like abbreviations like “VCS” to always
be written with their “full name (abbrev)” on first use, e.g. “version
control system (VCS)”, and then just the abbreviation thereafter. Not
that there is much of that in the doc.

Cheers,
Cameron Simpson cs@cskk.id.au

By Paul Moore via Discussions on Python.org at 29Mar2022 10:17:

I like this! […]
There’s a few terminology points where your wording seems slightly
unusual to me (as someone used to the packaging ecosystem)[1]. But
nothing significant, and if this does get incorporated into the PUG,
terminology can be tweaked as needed.

Thanks!

I am, indeed, not a native speaker of packaging, which contributed to my
difficulties. I needed a mental framework and short phrasebook, because
my hovercraft is full of eels.

Cheers,
Cameron Simpson cs@cskk.id.au


  1. Nothing specific, just a general feeling that the document wasn’t
    written by a “native speaker of packaging” :wink: ↩︎

1 Like

This is a not-very-new thread, but I wanted to add my $0.02:

  1. @cameron : thanks for working on this – It should be a great contribution to the packaging world

My main note:

" The document aims to outline the flow involved in publishing a package, usually to PyPI."

I actually think this is a problem with the entire realm of packaging documentation – the emphasis on “publishing on PyPI”. But isn’t that what people want/need to do?

Well, yes and not – yes, that’s important target group, but:

  1. There are many reasons to build and maintain packages that will never be published on PyPI
  2. The emphasis on publishing means that newbies tend to think they are “supposed” to do that – witness the plethora of half-baked unmaintained packages on PyPI :frowning:
  3. I actually think the publishing part is the easy part – it’s the building part that’s hard, particularly if you have to build C extensions (with or without Cython) – and that gets kinda lost in the shuffle.

Anyway – I’ll go read your page carefully now and comment there.

By Chris Barker via Discussions on Python.org at 14Apr2022 00:19:

This is a not-very-new thread, but I wanted to add my $0.02:

  1. @cameron : thanks for working on this – It should be a great
    contribution to the packaging world

I’m glad.

My main note:

" The document aims to outline the flow involved in publishing a package, usually to PyPI."

I actually think this is a problem with the entire realm of packaging documentation – the emphasis on “publishing on PyPI”. But isn’t that what people want/need to do?

Well, that is why I used the term “usually”. I want to set out what has
to be achieved, with a nod of the head at the usual target because (a)
that is often the new packager’s primary objective and (b) because it
gives a mental model of “where” a package may be published.

I’m trying to paint a mental picture here, so that people know where the
pieces fit in that picture. The other docs specify how the various
pieces of string or twine or rope must be made; I’m aiming for the shape
of the frame.

Well, yes and not – yes, that’s important target group, but:

  1. There are many reasons to build and maintain packages that will
never be published on PyPI

Yes. Again, “usually” PyPI, particularly for the new packager.

I particularly do not want to complicate the overview with all the
possible ways one might want to package+publish something. That…
burgeoning of possibilities is what makes the other docs so hard to
read. I’m after the flow, annotated with typical targets (PyPI) to
provide context.

  1. The emphasis on publishing means that newbies tend to think they are “supposed” to do that – witness the plethora of half-baked unmaintained packages on PyPI :frowning:

Well, perhaps. But if you’re not publishing (even to yourself, locally),
why would you need to package?

And if an overview makes it easier to fully bake and later maintain (by
releasing updates) a package through having a better conceptual grasp of
the framework, shouldn’t that help?

  1. I actually think the publishing part is the easy part – it’s the
    building part that’s hard, particularly if you have to build C
    extensions (with or without Cython). – and that gets kinda lost in the
    shuffle.

To me it is the pair. Yes, building is hard: not because it is
inherently difficult, but because it is hard to grasp.

Anyway – I’ll go read your page carefully now and comment there.

Many thanks!
Cameron Simpson cs@cskk.id.au

By Chris Barker via Discussions on Python.org at 14Apr2022 00:19:

This is a not-very-new thread, but I wanted to add my $0.02:

Also, I’ve made a topic this morning over here:

for getting this into the PyPA site. Maybe we should be over there?

Or maybe we should bikeshed/critique the doc here, and pursue process
over there?

Cheers,
Cameron Simpson cs@cskk.id.au

Well, perhaps. But if you’re not publishing (even to yourself, locally),
why would you need to package?

Ahh – that’s my point – perhaps we’re getting caught up in the two definitions of “package”:

  1. a collection of modules accessible to Python via “import”, e.g. “a dir with an __init__.py”

  2. The thing one puts on PyPI (also sometimes called a “distribution”, I think).

The two are not unrelated, but I’m talking about definition (1) here. And if you have even one bit of Python code that you want to use in more than one context, or an application that’s more than one file – then you really need to make a type-1 package out of it and have a way to install it in various virtual environments, etc.

Yes, that’s not strictly required – if you mess with PYTHONPATH or sys.path, or … but it really is the way to go.

I suppose that’s “publishing”, but I don’t think most folks would think of it that way.
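To make the type-1 sense concrete, here is a hypothetical sketch (the `mytools` name is made up) showing that an import package is just a directory with an `__init__.py` that Python can find, with no publishing step involved:

```python
import os
import sys
import tempfile

# A "type-1" package is simply a directory containing __init__.py.
# Build one on the fly under a temporary directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mytools")  # hypothetical package name
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("GREETING = 'hello'\n")

# Putting the parent directory on sys.path stands in for "installing"
# the package; in real use you would install it into an environment
# rather than editing sys.path by hand.
sys.path.insert(0, root)
import mytools

print(mytools.GREETING)  # hello
```

This is exactly the “mess with PYTHONPATH or sys.path” workaround mentioned above; proper packaging replaces the `sys.path` line with an install into an environment.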

I think the emphasis on type-2 packaging in so much of the docs leads newbies to one of two misconceptions:

  1. “I don’t want to share my code with the world – so I don’t need any of the packaging stuff.”

or

  2. “I have a collection of utilities I want to be able to use in my various scripts – I guess I need to make a package and put it on PyPI.”

@cameron: Your doc seems very much oriented to type-2 packaging, which is a fine thing – I only suggest that that be made clear – and I wish that there were better docs for type-1 packaging out there. Which I haven’t written, so I can just shut up now :wink:

1 Like

By Chris Barker via Discussions on Python.org at 14Apr2022 03:10:

Well, perhaps. But if you’re not publishing (even to yourself,
locally),
why would you need to package?

Ahh – that’s my point – perhaps we’re getting caught up in the two definitions of “package”:

  1. a collection of modules accessible to Python via “import”, e.g. “a
    dir with an __init__.py”
  2. The thing one puts on PyPI (also sometimes called a “distribution”,
    I think).
    […]

Yes, you’re right.

My doc is all about type 2: publishing a package, particularly the
metadata etc. needed to make it possible to publish to PyPI (or similar).

I suppose that the entire PyPA is about type (2) as well.

Cheers,
Cameron Simpson cs@cskk.id.au

To a large extent, yes it is.

If someone only wants to share code between a bunch of local scripts, there’s not really any well-defined solution. It’s basically “roll your own”, or use the same tools as people use to publish to PyPI (wheels, pip, etc) to share your code - for that you need somewhere to keep your “local” distribution files, but either a simple directory or a local package index is fine for that.

It would likely be useful to have some documentation under the PyPA banner covering how to share code locally as you describe. But currently there isn’t any, and no-one has offered to write any. I’d suggest that it needs to be kept clearly distinct from the “publish your code” documentation, though, as the trade-offs and compromises are very different.

For a long time now, I’ve argued that there’s not much help out there for people just wanting to write a bunch of Python scripts, either for themselves, or to share with colleagues/friends (but not publishing for the whole world to use). A lot of my coding is of this form, and frankly I don’t have any good solutions myself. I keep creating half-baked “this will do for now” solutions, but it would be nice to have some sort of recommended approach. PEP 582, “Python local packages directory” (which is currently draft, and appears to have lost momentum), is one possibility, as is installing a bunch of stuff in your “main” Python installation (or a dedicated virtualenv intended for running all these scripts). But no-one (in my experience) really shares suggestions or best practices in this area.

3 Likes

If someone only wants to share code between a bunch of local scripts, there’s not really any well-defined solution.

FWIW I install all these “local libraries” editably, which results in a reasonably nice workflow (but prevents migration to pyproject.toml for now, as I also have a few compiled components, which more or less forces (AFAICT) the use of setuptools with setup.py).

Editable installs don’t really make a difference here. What’s more important (to me) is where you install them to. Do you put them in your system Python and run all the scripts that depend on them from there? What about PyPI dependencies like jupyter or pandas? Do you also install them into your system Python? Or do you use virtual environments and install your local libraries into every virtualenv you use? Or something else?

(I guess editable installs avoid the need to reinstall when you change the local library, but that’s not really the problem I’m concerned about here).

Ah, yes, my point was mostly about local changes not requiring reinstalls.
On Arch Linux (my main distro) I install local packages with --user --no-deps (these options are actually set in my pip.conf), trying to stick as much as possible to distro-packaged packages. (For third-party packages not provided by distro packages, I actually wrote pypi2pkgbuild, a PyPI-to-PKGBUILD converter, to auto-convert PyPI packages into Arch Linux packages.) On other Linux distros, I install everything as --user. (This is for my “main working set” of day-to-day scripts which need to have consistent version bounds; separate tools which “just happen to be written in Python” but on which I am not doing development (e.g. black, cookiecutter, etc.) I just put in their own venv, which I can easily remove and recreate.)