Structure of the Packaging Strategy Discussions

pradyunsg · February 3, 2023, 2:03pm

(I’m pulling this out to a separate discussion since this is broadly applicable; @smm please correct me if I’m wrong!)

The whole plan with these discussions is to have targeted-ish topic discussions, to gather the thoughts + ideas + concerns + insights that people have around it (and see if that gets us in a positive direction!). Here’s the relevant quote from what @smm said when proposing these:

This exercise is similar to what was done for https://pypackaging-native.github.io, except that it’s happening on Discourse (because we collectively voted for that!), is much more broadly scoped, and has a wider audience. I’m optimistic that it’ll be useful in a similar way to how that site has been useful so far – it’s gathering the concerns and various thoughts into a single location and someone’s gonna do the work of making sense from those discussions.

Notably, as currently structured, this is a slower and more structured approach than we’ve usually had with our discussions. It’s also unenforced structure with basically soft suggestions of “let’s keep this focused on user support / unification / deprecations etc” and seeing where folks take the discussion. I think that’s fine!

These discussions are not where we’re picking a direction or strategy or what-you-want-to-call-it – we’re discussing what we collectively know today to establish a better shared understanding, and seeing what we can come up with to make progress on those specific topics.^[1]

I’ve now seen multiple people express some level of frustration-ish feeling around these discussions (on this forum and elsewhere). While I empathize that talking about problems and issues we’re dealing with isn’t particularly pleasant, and that it’ll make it more likely that you’ll want to act on it sooner^[2], I also think there is no reason to feel a sense of urgency here. We’re not going to magically/quickly solve issues that are happening at a larger-than-ever scale and that have grown into their current shape over more than a decade!

Personally, I’m optimistic that we’ll benefit from the shared context that these discussions are helping establish (and, they’ve already done a lot of that). I’m also hoping that we will make meaningfully positive changes to how we’re operating more broadly! These are separate things that will inform each other.

[1] That’s a big part of why I’ve not been saying too much on them – I don’t have strong opinions on a lot of this and absorbing the opinions/thoughts presented is enough energy already!
[2] I expect that some of you know that this sentence basically describes how I’ve operated over the last few years!

steve.dower · February 3, 2023, 2:48pm

Thanks for writing this up, I agree entirely.

We’ve had a few years now to convert to an “all online, all the time” mode of operation, but ultimately there are some kinds of discussions that really do require a room full of people. What we’ve been doing in these threads is sometimes called “level-setting,” where a group of people with diverse experiences have all identified a single problem but need to reach a common understanding of it before any unified forward movement can begin.^[1] So far, I think we’ve handled it incredibly well for an online discussion.

What seems to be the ideal (for most people) for these kinds of discussions is to be physically co-located for an extended period (~days to weeks, depending on the scale - I’d expect days for this one) and producing some concrete artifact at the end of it. Less ideal is to have regularly scheduled meetings in amongst other distractions, and at the bottom end is to have an online-only, text-only, open-invite discussion without a specific goal (sound familiar?).

Since we haven’t even agreed upon a suitable goal (I give examples below), it’s going to feel like we’re making no progress at all. Especially since we’re all in such different places on the whole issue, so there’s a lot of level-setting required. And also because people arrive late to the discussion, sometimes don’t read everything preceding their arrival, and so we have to restart all the context for them.^[2] It’s very hard to do this without sounding (or becoming!) exasperated, and unfortunately that affects the tone for everyone else in the discussion even when they are up to speed.

In terms of a deliverable or a goal, it’s fairly typical to produce a document or two of some kind. For example, when I participated in Microsoft’s response to dependency confusion, our goal was an internal policy document and an external guidance whitepaper. But it took six months of regular 2-3hr meetings between 20+ stakeholders to establish what needed to be in those, let alone what was accurate or useful to put in them. A lot of that time looked very much like our packaging discussions have been, so I’m also not concerned about how we’re doing.

The pypackaging-native site is another great example of a possible goal (well, not now that it’s been done). A survey of the problem, summarised and presented in a way that lets anyone discover the context without having to have been part of the days of discussions to get all the information out. I hope that one of the results from our discussions will be like this, and we definitely made progress toward agreeing on what we recognise as the fundamentals of the problem.^[3]

So yes, I’m feeling positive about the discussions. I’m a bit sad that I’m likely to miss all the in-person chats that will happen at various US-based conferences this year (though hopefully I can make some of the Europe ones), because those will be very beneficial. However, I would like to see us at least agree on a goal, which I assume will be some kind of document for us to refer to as the discussions move from level-setting into brainstorming, design and planning.

[1] Get everyone “up to the same level” while also collaboratively figuring out what that level is - hence, level-setting.
[2] One of the biggest advantages of online-only, text-only is that literally all the past discussion is available for people to catch up with. If you miss the first day of in-person talks, you’ll be lucky to get a formal summary.
[3] Because the complaints aren’t the fundamental problem. Complaints are typically [best to treat as] cries for help because of the fundamental problem, and so while we can and should respond directly to the complaint to treat the symptom, we also are the ones who need to dig deeper and find and fix the root causes.

btskinn · February 3, 2023, 5:59pm

This might not be 100% on topic to the thread, but I think it’s worth emphasizing:

Python has climbed to the #1 spot in multiple rankings of programming languages, IN SPITE OF its challenging packaging story.

Think about that for a minute.

I mean, really think about it. About what it says about Python and all of the hard work done on it, and on packaging Python.

Python packaging is something that numerous people have always (for ‘Internet’ values of always) had complaints about. As the discussions here, elsewhere, and elsewhen have described in gory detail, the challenges are big and complex, to the point that it’s hard to even see the full scope of the elephant.

But Python is successful. It’s an amazing thing.

And Python packaging does allow many, many people to do awesome work, even though numerous aspects of it can be a major headache sometimes. Python packaging also is a successful, amazing thing.

Even amid this possibly-neverending, difficult, slow work to make packaging better, IMO we would do well to keep this firmly in mind.

Kwpolska · February 3, 2023, 7:11pm

I don’t think referring to rankings is a good choice. The oft-cited TIOBE ranking, for example, is compiled by counting hits for a +"<language> programming" query in a bunch of search engines. It might correspond to the number of online educational resources and online discussions (although with a limited scope, since those might not say “Python programming” but just “Python”), but that might not be enough to capture the popularity of a language on the job market.

The popularity of Python does not at all mean it is a perfect language. Like every language, there are things that work better in Python than in other languages, and simultaneously there are things that turned out to be bad ideas after some time in the wild, or things that could be improved to make the language better for its users (the GIL would probably be the best example).

Furthermore, the popularity of the language does not mean the ecosystem around it is perfect or should be immune to criticism. Yes, it is possible to create and consume packages with Python. But many people believe the way it works is very far from the ideal, and would disagree with calling it an “amazing thing”.

pradyunsg · February 3, 2023, 7:42pm

No one said this or implied this. You’re writing to an audience of people who’ll be among the first to point out that such a thing does not exist.

No one said this or implied this. You’re writing to an audience of people who listen to the criticism and are responsible for figuring out the way forward.

Duly noted. I suggest we refrain from further discussion on this here – we’re firmly off topic now, and I’d much rather that the rest of this discussion be about the structure of the strategy discussions.

rgommers · February 9, 2023, 10:04pm

Thanks for capturing what these discussions are about @pradyunsg and @steve.dower.

I agree that would be a next step that’d be both logical and super useful. I have not seen anyone volunteer for this though - is this actually planned? If you’re inferring it from @smm’s last message on the strategy part 1 thread:

I am in the process of writing a blog post summarizing this thread and setting out a way forward. While there has not been a clear consensus on what unification will look like, I do feel there is enough consensus that this should be taken further to flesh out the details.

then I had the opposite impression. A blog post is unlikely to make a dent here, given the number of topics, types of users, constraints, etc. The amount of writing that is needed here to produce the website or document is substantial, so it’d be reassuring if there were a plan for doing this before diving into strategy threads parts 2 to 5.

pradyunsg · February 11, 2023, 1:00pm

You make an excellent point, and as far as I can tell: No, not at this time.

Indeed.

Too late for that – we’re already 40 posts into the part 2 discussion (Python Packaging Strategy Discussion - Part 2).

FWIW, the blog post that Shamika mentioned is up: Python Software Foundation News: Python Packaging Strategy Discussion Summary - Part 1 – it certainly doesn’t cover the details to the same extent as I’d hoped it would.

jagerber · February 19, 2023, 1:23am

I’m a total newcomer here and also don’t know much about packaging but am frustrated by the current state of things and I think I am one of the target users trying to be helped by this “new strategy”. Let me know if I’m out of line here, but maybe my naïve perspective might be valuable.

A suggestion I would make about the structure of these discussions: it would be nice to have one forum (perhaps a thread or series of threads in this subforum) for the desiderata (i.e. requirements) for the packaging/environment management strategy and another similar/parallel forum where the capabilities of existing tools are laid out and compared and contrasted. It may even be reasonable to have a third forum where shortcomings of the existing tools/workflows could be laid out and discussed but this ground may be covered by the other two.

It seems to me like these are the things which need to be collectively understood before first steps can be taken towards deciding on and eventually implementing some “strategy”. Unfortunately I think these three things are all being discussed piecemeal in the existing forums in a way that makes it very challenging for me (as a newbie in the field) to follow what is going on or to attempt to contribute any thoughts.

I’m imagining something like: Someone posts a bulleted list of desiderata for the new strategy then people pick apart bullet points, try to remove some or add new ones or make modifications, then the list is updated, or maybe someone posts a counter-list etc. Maybe there’s a “dream” list and “easiest-to-implement list” and a “moderate” list or something, whatever. The conversation could be messy but, ideally, it wouldn’t be bogged down by scattered capabilities of existing tools, which seems to be something that is happening a lot in other threads. Again, perhaps I just don’t follow well enough and everyone in the discussions implicitly understands (1) the desiderata and (2) the capabilities of existing tools and has moved onto more practical details, but even then, it would be good to have these things laid out somewhere for someone like me to be able to follow.

Apologies if such forums already exist and I missed them. I would much appreciate being pointed towards them if they do, or having an explanation of why having discussions split up like this is not an idea that should be pursued.

steve.dower · February 20, 2023, 1:42pm

That’s more or less the structure we currently have. But because they’ve been discussions that progress over time, if you haven’t been following along from the start there’s no real easy way to see the summary.

Picture it like a 4 hour in-person meeting, where most of the tools are represented, and people get their chance to speak, but we also intersperse discussion about certain points as they’re raised.

Now if you weren’t there, when you come to the threads where it happened, it’s like you’ve been handed a 4 hour video of the entire meeting and told to watch it. Not easy to see the structure, as it’s mostly for the benefit of the participants - not the benefit of the observers.

So I think as a group, those of us who have been participating the whole time are doing what you suggest, and have actually been making progress. A few of us are going on the Talk Python podcast this week to share on it, so that will hopefully be of more value to catch people up on the range of things we’ve covered. We’re also still hopeful that someone will write things up in a consumable format, but nobody has volunteered for that (and I think a lot of us still feel like we’re in early enough stages that putting anything in writing will be more harmful than helpful right now).

Also, we’ve split up discussions in the past. What we’ve found is that without everyone having all the context, when we come back together nothing makes sense. This is our strongest attempt yet at trying to define the unifying part of the strategy, so that it becomes possible to have fruitful discussions on specific topics. My personal opinion is that it’s still way too early to split things up, and we’ve still got a lot to work through as a big group first. At internet speeds, that will take months (but the bright side of it taking that long is we can integrate other information as we go - a week-long in-person gathering isn’t able to do that so easily).

Edit: Oh, and if you actually want to see the discussions, they’re all here on Discourse. Basically every thread with over 100 replies in this category is part of it.

btskinn · March 4, 2023, 4:21am

After letting things stew in the back of my mind for a while, I think that this set of packaging discussions may have started in the wrong place.

I’m not fully caught up on both threads currently, but as best I can tell the discussion started with a focus on tools and functionality—and I don’t think this is the right starting point.

Where I think it needs to start is on people and their needs.

If we don’t have a good picture of who all is out there, and what they need to do with (a) distribution of Python itself and (b) Python package creation, distribution and installation, how can we correctly understand the scope of the challenges, and then make sensible decisions about how many and what kind of tools need to exist, with what functionality? (And yes, I’m in ‘Camp One Tool to Rule Them All Is Flatly Impossible™.’)

As I see it, the solution has to involve^[1]:

1. Developing an understanding of the landscape of myriad Python distribution and packaging situations and user needs
2. Identifying where those situations/needs have mutually-exclusive constraints
3. Defining a relatively small number of large swaths of the landscape, each of which can be served by single tools (existing tooling meeting real-world needs can help us here)
4. Developing (or continuing to develop) those tools to meet those scoped needs
5. Communicating which is the right tool for its corresponding application spaces
6. And then, as bandwidth and energy allow, filling in the chinks that are left over

In particular, I think it’s key to recognize that (1) is an extremely high-dimensional problem, spanning different:

Types of user roles (distributor, packager, installer, …)
Levels of expertise in different areas, including varying expertise within a single user
Platforms (OS, PC/mobile/browser, …)
Project code types (pure Python, C extensions, Rust extensions, …)
Project types (library, tool, application (further divided to web/CLI/GUI/…), data analysis/science/ML)
Runtimes (CPython, PyPy, Micropython, …)
plus more…

Even representing this full landscape seems like a grand challenge in its own right.

But, I don’t see how we effectively move to (2) and beyond^[2], without at least a reasonably complete picture of this elephant…of the full sweep of who is building and using Python distributions and packages, for whatever reasons and with whatever goals^[3].

Without tackling the first hard problem^[4], we can’t effectively tackle the second.

[1] It seems to me that most of the discussions to date have been centered on (4) of this list. Occasionally, conversation has touched on some elements of (1), but in a fragmented and incomplete way, certainly not enough to set the stage effectively for all the attention we’ve paid so far to (4).
[2] I think the un- or indirectly-addressed considerations of (2) have led to frustration when one party asserts something about a tool, which another party immediately sees as incompatible with their needs.
[3] @rgommers’ pypackaging-native seems to me one example of work toward both (1) and (2), at least for a slice of the ecosystem.
[4] Which is arguably the hard-er problem!

smm · March 9, 2023, 5:05pm

You are right. We will have to identify the various user groups that exist within the packaging ecosystem and see how best to support them. But everything you have suggested is time and effort intensive. Before I ask volunteers to devote their time to a new project (this could be unification/better user support/enabling long term contribution), I have to gauge whether there is any interest in the maintainer community to go down this avenue.

From these discussions, I will know if the maintainer community is on board with each strand of the strategy discussions. If there is interest, as a community we would delve into each strand in more detail and do exactly what you have mentioned.

smm · March 9, 2023, 5:18pm

There seems to be some discrepancy in expectations here. Given that the strategy discussions have just started and I am trying to define a high-level strategy, what do you expect to see in a document? This doc could be one of the deliverables once we do a deep-dive.

ambv · March 14, 2023, 2:26pm

22 posts were split to a new topic: Python packaging documentation feedback and discussion

rgommers · March 9, 2023, 9:08pm

It feels like they’re winding down to be honest. There are two super long threads full of valuable info (if in a very messy form):

That’s 3.5 months’ worth of discussion, and almost 500 posts (plus a few blog posts and other spinoffs). The Python Packaging Strategy Discussion - Part 2 thread in contrast is only 65 posts (and less new/insightful perhaps too), and no one has posted there in 15 days now.

Without a much clearer plan for following up on these discussions, I think the energy is going to continue ebbing away.

@steve.dower gave a couple of good examples/suggestions higher up already: a policy doc, a whitepaper, a website like pypackaging-native. @lwasser’s guide is also a good example of a possible outcome - she has done a really nice job of iterating on content, asking maintainers (which overlap with participants in the threads here) for feedback on accuracy of what she wrote and whether the guidance represents best practices. That kind of effort helps distill shared knowledge as well as open issues and pain points. Finally, updates to existing documentation would also be a great outcome.

Given the breadth of topics we’ve covered so far, I’d expect at least 2-3 separate documents, possibly in very different formats.

I do not think that is possible, beyond setting out how to go from these threads to those documents, and what they are (topics, format). The questions and goals are still ill-defined, and for the main questions/goals for which we have a rough understanding of what they are, there is no consensus opinion on an answer.

pf_moore · March 9, 2023, 9:37pm

I agree. The lack of any substantial results from the first discussion has certainly put me off investing any time or energy into the second and any subsequent ones.

smm · March 13, 2023, 11:24am

A couple of suggestions have come in regarding how to proceed with these strategy discussions, so I feel I need to clarify what I see as the expected deliverable from these 5 strategy threads (2 completed, 3 planned).

The 5 topics for strategy discussions were chosen based on feedback from the user survey and key observations around how this community works. The strategy discussions are a way to validate whether each strand should be taken further. This is the reason why each prompt has been a low ball question to gauge how strong the signal is for each prompt. They are not meant to be deep-dives that would end in a white paper.

The expected deliverable from these 5 threads is a high-level strategy document that will indicate which threads (workstreams) we will be pursuing in depth and who will be working on each workstream. I expect this deep-dive to produce the policy doc/white paper/technical roadmap that is expected by many in this thread. Generally, I expect the strategy to look like this:

Next 3-6 months: Produce high-level strategy doc, set up workstreams and working group for each workstream, define objectives and deliverables for each working group
Next 1 year: Deep dive into each workstream, targeted user surveys as suggested by multiple people in this thread, options analysis and building consensus for the chosen solution(s), define technical roadmap
Next 2-5 years: Develop solutions and deliver

I do not expect the deep-dive to take one year but building consensus might take a year. I have to see how each thread plays out in the community before we get to the point of developing the white paper. The suggestions regarding user surveys and policy docs that have come in via this thread are perfectly valid but it is too soon to do it right away, at least in my opinion.

steve.dower · March 13, 2023, 5:52pm

This thread is definitely getting off topic (I know that because I’m sure I just replied to some of these comments on another thread).

Shamika’s post seems to be the last on-topic one.

So how do we feel about this as the outlined approach?

ambv · March 14, 2023, 3:29pm

A post was merged into an existing topic: Python packaging documentation feedback and discussion

pf_moore · March 13, 2023, 6:09pm

Personally, I find it disappointing, if I’m honest. Workstreams and working groups don’t sound like the sort of thing I’d expect in volunteer-driven open source work. It feels too “management heavy”, and I don’t like the implication that things get done “in private” (I imagine working groups will use face to face calls and similar, which are much better high-bandwidth ways of having discussions, but they do exclude non-participants, even in the best cases).

On a personal note, I want to be involved in this work, but I cannot imagine staying motivated over 18 months of nothing but discussions, planning and consensus building^[1]. I think it’s the wrong approach, and I’d much rather we identified and worked on small, independent changes that produce measurable, incremental improvements. Basically, the classic agile approach.

I’m also bothered by the idea that we can somehow just magically find resources for these workstreams. Many of the people doing the most work in the packaging ecosystem don’t participate heavily in these discussions, because they are busy actually developing tools^[2]. We don’t want to divert them from that, but equally, what use is a working group that doesn’t include the key players?

Sorry. I don’t want to be negative, but that’s my honest view…

[1] The first strategy discussion pretty much burned everyone out, to the extent that the second one is struggling to get interest. How will 18 months of that fare?
[2] I’m carefully not looking in the mirror at this point, because I won’t like what I see

sinoroc · March 13, 2023, 7:41pm

I do not think this makes sense. Who out of the people working on Python packaging will still be here 5 years from now? Especially volunteers who just come and go as time, energy, and motivation permit. Who will ensure continuity of this work over 5 years? I feel like this is the wrong approach entirely.

How can this be applied to volunteers?

I guess, probably I am just not the target audience at all for this message, and I am missing a lot of context.