Modernising my packages - am I thinking about this all wrong?

By Paul Moore via Discussions on Python.org at 29Mar2022 10:17:

I like this! […]
There’s a few terminology points where your wording seems slightly
unusual to me (as someone used to the packaging ecosystem)[1]. But
nothing significant, and if this does get incorporated into the PUG,
terminology can be tweaked as needed.

Thanks!

I am, indeed, not a native speaker of packaging, which contributed to my
difficulties. I needed a mental framework and short phrasebook, because
my hovercraft is full of eels.

Cheers,
Cameron Simpson cs@cskk.id.au


  1. Nothing specific, just a general feeling that the document wasn’t
    written by a “native speaker of packaging” :wink: ↩︎


This is a not-very-new thread, but I wanted to add my $0.02:

  1. @cameron : thanks for working on this – It should be a great contribution to the packaging world

My main note:

" The document aims to outline the flow involved in publishing a package, usually to PyPI."

I actually think this is a problem with the entire realm of packaging documentation – the emphasis on “publishing on PyPI”. But isn’t that what people want/need to do?

Well, yes and no – yes, that’s an important target group, but:

  1. There are many reasons to build and maintain packages that will never be published on PyPI
  2. The emphasis on publishing means that newbies tend to think they are “supposed” to do that – witness the plethora of half-baked unmaintained packages on PyPI :frowning:

  3. I actually think the publishing part is the easy part – it’s the building part that’s hard, particularly if you have to build C extensions (with or without Cython) – and that gets kinda lost in the shuffle.

Anyway – I’ll go read your page carefully now and comment there.

By Chris Barker via Discussions on Python.org at 14Apr2022 00:19:

This is a not-very-new thread, but I wanted to add my $0.02:

  1. @cameron : thanks for working on this – It should be a great
    contribution to the packaging world

I’m glad.

My main note:

" The document aims to outline the flow involved in publishing a package, usually to PyPI."

I actually think this is a problem with the entire realm of packaging documentation – the emphasis on “publishing on PyPI”. But isn’t that what people want/need to do?

Well, that is why I used the term “usually”. I want to set out what has
to be achieved, with a nod of the head at the usual target because (a)
that is often the new packager’s primary objective and (b) because it
gives a mental model of “where” a package may be published.

I’m trying to paint a mental picture here, so that people know where the
pieces fit in that picture. The other docs specify how the various
pieces of string or twine or rope must be made; I’m aiming for the shape
of the frame.

Well, yes and no – yes, that’s an important target group, but:

  1. There are many reasons to build and maintain packages that will
    never be published on PyPI

Yes. Again, “usually” PyPI, particularly for the new packager.

I particularly do not want to complicate the overview with all the
possible ways one might want to package+publish something. That…
burgeoning of possibilities is what makes the other docs so hard to
read. I’m after the flow, annotated with typical targets (PyPI) to
provide context.

  2. The emphasis on publishing means that newbies tend to think they are “supposed” to do that – witness the plethora of half-baked unmaintained packages on PyPI :frowning:

Well, perhaps. But if you’re not publishing (even to yourself, locally),
why would you need to package?

And if an overview makes it easier to fully bake and later maintain (by
releasing updates) a package through having a better conceptual grasp of
the framework, shouldn’t that help?

  3. I actually think the publishing part is the easy part – it’s the
    building part that’s hard, particularly if you have to build C
    extensions (with or without Cython) – and that gets kinda lost in the
    shuffle.

To me it is the pair. Yes, building is hard – not because the individual
steps are hard, but because the whole process is hard to grasp.

Anyway – I’ll go read your page carefully now and comment there.

Many thanks!
Cameron Simpson cs@cskk.id.au

By Chris Barker via Discussions on Python.org at 14Apr2022 00:19:

This is a not-very-new thread, but I wanted to add my $0.02:

Also, I’ve made a topic this morning over here:

for getting this into the PyPA site. Maybe we should be over there?

Or maybe we should bikeshed/critique the doc here, and pursue process
over there?

Cheers,
Cameron Simpson cs@cskk.id.au

Well, perhaps. But if you’re not publishing (even to yourself, locally),
why would you need to package?

Ahh – that’s my point – perhaps we’re getting caught up in the two definitions of “package”:

  1. a collection of modules accessible to Python via “import”, e.g. a dir with an `__init__.py`

  2. The thing one puts on PyPI (also sometimes called a “distribution”, I think).

The two are not unrelated, but I’m talking about definition (1) here. And if you have even one bit of Python code that you want to use in more than one context, or an application that’s more than one file – then you really need to make a type-1 package out of it and have a way to install it in various virtual environments, etc.

Yes, that’s not strictly required – if you mess with PYTHONPATH or sys.path, or … but it really is the way to go.

I suppose that’s “publishing”, but I don’t think most folks would think of it that way.
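To make the type-1 case concrete, here’s a minimal sketch of such a package – every name in it (mytools, greet) is a placeholder, and the src layout is just one common convention:

```shell
# Sketch: a tiny "type-1" package, never intended for PyPI.
mkdir -p mytools/src/mytools

cat > mytools/pyproject.toml <<'EOF'
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "mytools"
version = "0.1"
EOF

cat > mytools/src/mytools/__init__.py <<'EOF'
def greet(name):
    return f"hello, {name}"
EOF

# In each virtual environment that needs it (not run here):
#   python -m pip install ./mytools    # or `-e` for an editable install
ls mytools/src/mytools                 # prints __init__.py
```

Once it’s installed into an environment, any script run there can simply `import mytools` – no PYTHONPATH or sys.path fiddling.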

I think the emphasis on type-2 packaging in so much of the docs leads newbies to one of two incorrect conclusions:

  1. “I don’t want to share my code with the world – I don’t need any of the packaging stuff.”

or

  2. “I have a collection of utilities I want to be able to use in my various scripts – I guess I need to make a package and put it on PyPI”

@cameron: Your doc seems very much oriented to type-2 packaging, which is a fine thing – I only suggest that that be made clear – and I wish that there were better docs for type-1 packaging out there. Which I haven’t written, so I can just shut up now :wink:


By Chris Barker via Discussions on Python.org at 14Apr2022 03:10:

Well, perhaps. But if you’re not publishing (even to yourself,
locally),
why would you need to package?

Ahh – that’s my point – perhaps we’re getting caught up in the two definitions of “package”:

  1. a collection of modules accessible to Python via “import”, e.g. a
    dir with an `__init__.py`
  2. The thing one puts on PyPI (also sometimes called a “distribution”,
    I think).
    […]

Yes, you’re right.

My doc is all about type 2: publishing a package, particularly the
metadata etc needed to make it possible to publish to PyPI (or similar).

I suppose that the entire PyPA is about type (2) as well.

Cheers,
Cameron Simpson cs@cskk.id.au

To a large extent, yes it is.

If someone only wants to share code between a bunch of local scripts, there’s not really any well-defined solution. It’s basically “roll your own”, or use the same tools as people use to publish to PyPI (wheels, pip, etc) to share your code - for that you need somewhere to keep your “local” distribution files, but either a simple directory or a local package index is fine for that.
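For what it’s worth, the “simple directory” variant can look roughly like this (the wheelhouse directory and the mytools package are made-up names):

```shell
# Sketch: a plain directory serving as a "local index" of wheels.
mkdir -p wheelhouse

# Build a wheel from a local project and drop it in (not run here):
#   python -m pip wheel --no-deps -w wheelhouse ./mytools

# Any environment can then install from that directory instead of PyPI:
#   python -m pip install --no-index --find-links wheelhouse mytools
```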

It would likely be useful to have some documentation under the PyPA banner covering how to share code locally as you describe. But currently there isn’t any, and no-one has offered to write any. I’d suggest that it needs to be kept clearly distinct from the “publish your code” documentation, though, as the trade-offs and compromises are very different.

For a long time now, I’ve argued that there’s not much help out there for people just wanting to write a bunch of Python scripts, either for themselves, or to share with colleagues/friends (but not publishing for the whole world to use). A lot of my coding is of this form, and frankly I don’t have any good solutions myself. I keep creating half-baked “this will do for now” solutions, but it would be nice to have some sort of recommended approach. PEP 582 (“Python local packages directory”, currently a draft that appears to have lost momentum) is one possibility, as is installing a bunch of stuff in your “main” Python installation (or a dedicated virtualenv intended for running all these scripts). But no-one (in my experience) really shares suggestions or best practices in this area.


If someone only wants to share code between a bunch of local scripts, there’s not really any well-defined solution.

FWIW I install all these “local libraries” editably, which results in a reasonably nice workflow (but prevents migration to pyproject.toml for now, as I also have a few compiled components, which more or less forces (AFAICT) the use of setuptools with setup.py).

Editable installs don’t really make a difference here. What’s more important (to me) is where you install them to. Do you put them in your system Python and run all the scripts that depend on them from there? What about PyPI dependencies like jupyter or pandas? Do you also install them into your system Python? Or do you use virtual environments and install your local libraries into every virtualenv you use? Or something else?

(I guess editable installs avoid the need to reinstall when you change the local library, but that’s not really the problem I’m concerned about here).

Ah, yes, my point was mostly about local changes not requiring reinstalls.
On Arch Linux (my main distro) I install local packages with --user --no-deps (these options are actually set in my pip.conf), trying to stick as much as possible to distro-packaged packages. (For third-party packages not provided by the distro, I actually wrote pypi2pkgbuild (GitHub: anntzer/pypi2pkgbuild) to auto-convert PyPI packages into Arch Linux packages.) On other Linux distros, I install everything as --user. (This is for my “main working set” of day-to-day scripts, which need to have consistent version bounds; separate tools which “just happen to be written in Python” but on which I am not doing development (e.g. black, cookiecutter, etc.) each go in their own venv, which I can easily remove and recreate.)
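For the record, defaults like those can go in pip’s config file – the `[install]` section keys mirror the long option names (the real location varies by platform, e.g. ~/.config/pip/pip.conf on Linux; this sketch just writes the file locally):

```shell
# Sketch: a pip.conf making a plain `pip install` behave as
# `pip install --user --no-deps`.
cat > pip.conf <<'EOF'
[install]
user = true
no-deps = true
EOF
cat pip.conf
```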

Discourse somehow highlighted this thread for me now (I don’t go here often), so I thought I’d add my $0.02 for the record.

I think the current packaging tools work fine for the not-going-to-distribute-to-the-world packages, i.e. make my stuff available to various scripts on my machine, etc. The only real issue is a lack of docs: the documentation, as pointed out already, is oriented to publishing/distributing packages.

I’ve found that making a package is exactly the way to make code available on my and my team’s system(s). It’s really painful to write any more than a single file script without using the packaging infrastructure.

I addressed this in a lightning talk at SciPy a few years back – oriented particularly to the SciPy crowd. It’s published (and yes, it is out of date) here:

http://pythonchb.github.io/PythonTopics/where_to_put_your_code.html

Not only is it out of date, it’s simplified to a particular use case and to a level of complexity that could fit into a lightning talk.

I really should expand (and modernise) that, and add more notes about how to actually use the resulting package: venv and conda environments, editable installs vs built installs vs wheels, etc. All the issues brought up on this thread.

But in short – I think the tools are there for these use cases, we just need to tell people about them, and how to use them.

NOTE: I’ve been doing this kind of thing for years – both for myself only, and for packages I share with my team. It has worked great (and boy was I happy when I discovered it way back when – fiddling with PYTHONPATH and sys.path was painful).