Structure of the Packaging Strategy Discussions

After letting things stew in the back of my mind for a while, I think that this set of packaging discussions may have started in the wrong place.

I’m not fully caught up on both threads currently, but as best I can tell the discussion started with a focus on tools and functionality—and I don’t think this is the right starting point.

Where I think it needs to start is on people and their needs.

If we don’t have a good picture of who all is out there, and what they need to do with (a) distribution of Python itself and (b) Python package creation, distribution and installation, how can we correctly understand the scope of the challenges, and then make sensible decisions about how many and what kind of tools need to exist, with what functionality? (And yes, I’m in ‘Camp One Tool to Rule Them All Is Flatly Impossible™.’)

As I see it, the solution has to involve[1]:

  1. Developing an understanding of the landscape of myriad Python distribution and packaging situations and user needs
  2. Identifying where those situations/needs have mutually-exclusive constraints
  3. Defining a relatively small number of large swaths of the landscape, each of which can be served by single tools (existing tooling meeting real-world needs can help us here)
  4. Developing (or continuing to develop) those tools to meet those scoped needs
  5. Communicating which is the right tool for its corresponding application spaces
  6. And then, as bandwidth and energy allow, filling in the chinks that are left over

In particular, I think it’s key to recognize that (1) is an extremely high-dimensional problem, spanning different:

  • Types of user roles (distributor, packager, installer, …)
  • Levels of expertise in different areas, including varying expertise within a single user
  • Platforms (OS, PC/mobile/browser, …)
  • Project code types (pure Python, C extensions, Rust extensions, …)
  • Project types (library, tool, application (further divided to web/CLI/GUI/…), data analysis/science/ML)
  • Runtimes (CPython, PyPy, MicroPython, …)
  • plus more…
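
To give a rough feel for how quickly this blows up, here is a minimal sketch that treats each bullet above as one axis and counts the combinations. The values are only the examples named in the bullets (every "…" means the real axis is longer), the expertise labels are placeholders made up for illustration, and the names `DIMENSIONS`, `count_situations` and `iter_situations` are hypothetical.

```python
from itertools import product
from math import prod

# Example values only, taken from the bullets above; the real axes are longer.
# The three expertise labels are hypothetical placeholders.
DIMENSIONS = {
    "user_role": ["distributor", "packager", "installer"],
    "expertise": ["novice", "intermediate", "expert"],
    "platform": ["PC", "mobile", "browser"],
    "code_type": ["pure Python", "C extension", "Rust extension"],
    "project_type": ["library", "tool", "web app", "CLI app", "GUI app",
                     "data/science/ML"],
    "runtime": ["CPython", "PyPy", "MicroPython"],
}


def count_situations(dimensions):
    """Number of distinct combinations, picking one value per dimension."""
    return prod(len(values) for values in dimensions.values())


def iter_situations(dimensions):
    """Yield each combination as a dict mapping dimension name to value."""
    names = list(dimensions)
    for combo in product(*dimensions.values()):
        yield dict(zip(names, combo))


print(count_situations(DIMENSIONS))  # prints 1458
```

Even with only the handful of example values named above, that is already well over a thousand combinations, before filling in the elided entries or the "plus more…" axes. Many of those combinations will be empty or equivalent in practice, which is part of what mapping the landscape would need to establish.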

Even representing this full landscape seems like a grand challenge in its own right.

But, I don’t see how we effectively move to (2) and beyond[2], without at least a reasonably complete picture of this elephant…of the full sweep of who is building and using Python distributions and packages, for whatever reasons and with whatever goals[3].

Without tackling the first hard problem[4], we can’t effectively tackle the second.


  1. It seems to me that most of the discussions to date have been centered on (4) of this list. Occasionally, conversation has touched on some elements of (1), but in a fragmented and incomplete way—certainly not enough to set the stage effectively for all the attention we’ve paid so far to (4). ↩︎

  2. I think the un- or indirectly-addressed considerations of (2) have led to frustration when one party asserts something about a tool, which another party immediately sees as incompatible with their needs. ↩︎

  3. @rgommers’ pypackaging-native seems to me one example of work toward both (1) and (2), at least for a slice of the ecosystem. ↩︎

  4. Which is arguably the hard-er problem! ↩︎

You are right. We will have to identify the various user groups that exist within the packaging ecosystem and see how best to support them. But everything you have suggested is time- and effort-intensive. Before I ask volunteers to devote their time to a new project (this could be unification/better user support/enabling long-term contribution), I have to gauge whether there is any interest in the maintainer community in going down this avenue.

From these discussions, I will know if the maintainer community is on board with each strand of the strategy discussions. If there is interest, as a community we would delve into each strand in more detail and do exactly what you have mentioned.

There seems to be some discrepancy in expectations here. Given that the strategy discussions have just started and I am trying to define a high-level strategy, what do you expect to see in a document? This doc could be one of the deliverables once we do a deep-dive.

It feels like they’re winding down to be honest. There are two super long threads full of valuable info (if in a very messy form):

That’s 3.5 months’ worth of discussion, and almost 500 posts (plus a few blog posts and other spinoffs). The Python Packaging Strategy Discussion - Part 2 thread, in contrast, is only 65 posts (and perhaps less new/insightful too), and no one has posted there in 15 days now.

Without a much clearer plan for following up on these discussions, I think the energy is going to continue ebbing away.

@steve.dower gave a couple of good examples/suggestions higher up already: a policy doc, a whitepaper, a website like pypackaging-native. @lwasser’s guide is also a good example of a possible outcome - she has done a really nice job of iterating on content, asking maintainers (which overlap with participants in the threads here) for feedback on accuracy of what she wrote and whether the guidance represents best practices. That kind of effort helps distill shared knowledge as well as open issues and pain points. Finally, updates to existing documentation would also be a great outcome.

Given the breadth of topics we’ve covered so far, I’d expect at least 2-3 separate documents, possibly in very different formats.

I do not think that is possible, beyond setting out how to go from these threads to those documents, and what they are (topics, format). The questions and goals are still ill-defined, and for the main questions/goals for which we have a rough understanding of what they are, there is no consensus opinion on an answer.

I agree. The lack of any substantial results from the first discussion has certainly put me off investing any time or energy into the second and any subsequent ones.

A couple of suggestions have come in regarding how to proceed with these strategy discussions, so I feel I need to clarify what I see as the expected deliverable from these 5 strategy threads (2 completed, 3 planned).

The 5 topics for strategy discussions were chosen based on feedback from the user survey and key observations around how this community works. The strategy discussions are a way to validate whether each strand should be taken further. This is the reason why each prompt has been a low-ball question to gauge how strong the signal is for each prompt. They are not meant to be deep-dives that would end in a white paper.

The expected deliverable from these 5 threads is a high-level strategy document that will indicate which threads (workstreams) we will be pursuing in depth and who will be working on each workstream. I expect this deep-dive to produce the policy doc/white paper/technical roadmap that is expected by many in this thread. Generally, I expect the strategy to look like this:

  • Next 3-6 months: Produce high-level strategy doc, set up workstreams and a working group for each workstream, define objectives and deliverables for each working group
  • Next 1 year: Deep dive into each workstream, targeted user surveys as suggested by multiple people in this thread, options analysis and building consensus for the chosen solution(s), define technical roadmap
  • Next 2-5 years: Develop solutions and deliver

I do not expect the deep-dive itself to take one year, but building consensus might. I have to see how each thread plays out in the community before we get to the point of developing the white paper. The suggestions regarding user surveys and policy docs that have come in via this thread are perfectly valid, but it is too soon to act on them right away, at least in my opinion.

This thread is definitely getting off topic (I know that because I’m sure I just replied to some of these comments on another thread :wink: )

Shamika’s post seems to be the last on-topic one.

So how do we feel about this as the outlined approach?

Personally, I find it disappointing, if I’m honest. Workstreams and working groups don’t sound like the sort of thing I’d expect in volunteer-driven open source work. It feels too “management heavy”, and I don’t like the implication that things get done “in private” (I imagine working groups will use face to face calls and similar, which are much better high-bandwidth ways of having discussions, but they do exclude non-participants, even in the best cases).

On a personal note, I want to be involved in this work, but I cannot imagine staying motivated over 18 months of nothing but discussions, planning and consensus building[1]. I think it’s the wrong approach, and I’d much rather we identified and worked on small, independent changes that produce measurable, incremental improvements. Basically, the classic agile approach.

I’m also bothered by the idea that we can somehow just magically find resources for these workstreams. Many of the people doing the most work in the packaging ecosystem don’t participate heavily in these discussions, because they are busy actually developing tools[2]. We don’t want to divert them from that, but equally, what use is a working group that doesn’t include the key players?

Sorry. I don’t want to be negative, but that’s my honest view…


  1. The first strategy discussion pretty much burned everyone out, to the extent that the second one is struggling to get interest. How will 18 months of that fare? ↩︎

  2. I’m carefully not looking in the mirror at this point, because I won’t like what I see :slightly_smiling_face: ↩︎

I do not think this makes sense. Who out of the people working on Python packaging will still be here 5 years from now? Especially volunteers who just come and go as time, energy, and motivation permit. Who will ensure continuity of this work over 5 years? I feel like this is the wrong approach entirely.

How can this be applied to volunteers?

I guess, probably I am just not the target audience at all for this message, and I am missing a lot of context.

I have mixed feelings.

I agree with @pf_moore that that sounds a bit bureaucratic and is a pretty protracted timeframe. That said, I do feel that these discussions are helpful in getting ideas out there and understanding the various perspectives that people bring to the problem.

Maybe the main specific worry I have about that outline is that it seems too specific for its timescale. It doesn’t seem realistic to me that we can know now that we will spend 6 months talking, 1 year doing a “deep dive”[1], and so on. That would seem to preclude taking earlier action on some low-hanging fruit (like maybe some doc revisions), and it also pushes actual action so far into the future that the ground could change under our feet by then.

Personally (as may be obvious from my own posts :slight_smile:) I tend to think it’s more fruitful for people to just lay their cards on the table and say “I think X is a problem and I think we could solve it with Y”, and then for someone else to say “I don’t think X is a problem” or “I agree X is a problem but I think we need to solve it with Z”, and via a sort of Socratic dialogue converge on agreement on at least some tangible proposals. I understand that can be somewhat draining for those with more knowledge and experience, who are reluctant to participate if it just means a long slog of bringing less experienced folks up to speed on issues that have been rehashed ad nauseam over the years, but I’m not sure to what extent other modes of discussion would ameliorate that.

So far, to put it as optimistically as possible :wink:, I think there is some useful emerging consensus on particular problems or desiderata that are getting close to the level of specificity where progress towards a solution might be possible: a clear pathway for packaging applications, a clear move away from having packages install by default into a system Python, maybe even some more “opinionated” docs. I don’t see so much agreement on solutions at this point though. And there also are still areas where there’s disagreement about whether something is even a problem that needs to be fixed.

As @pf_moore noted, it’s unclear who would eventually implement whatever gets decided on. Is the intent that if there is sufficient agreement on certain courses of action, the PSF will be funding that work? A related issue that has been mentioned here and there in these threads is the disconnect in mission/purview/authority between the steering council and PyPA. These raise some doubt about whether adequate action would be taken even if we could agree on everything.

So all in all I’m okay with the concept of “take some time to hash out the issues, settle on some solutions, and make them happen”. But I’m not sure that committing to a timeline like the one @smm laid out is the best way to go about that.


  1. I don’t think I really understand what this would entail ↩︎

I want to preface this by highlighting @pf_moore’s post on the Packaging Strategy Part 2 thread that summarizes many of the concerns people have about this process, and directly led to the creation of this “Structure” thread:

As the person currently responsible for facilitating this process, it would be particularly helpful to hear your viewpoint and responses to the points described there.

Err, well, is the second thread really “completed”? It seems that the discussion primarily petered out on that specific thread not because it was in any way “complete”, nor because the nominal “deadline” elapsed, nor due to lack of interest in the specific topic (as active discussion on the same topic has continued on this thread and others).

Rather, as the other folks here commented, many participants were likely just simply exhausted from the first thread, and perhaps discouraged by the lack of tangible results of the “strategy thread” format, and concluded their limited time and energy, so essential to the continued health of the Python packaging ecosystem, would be better spent elsewhere on more productive avenues.

I bring this up because if the whole goal of the strategy discussions was:

…Then the primary “signal” is likely merely going to be a strong decay in responses (and overall enthusiasm for the effort) from discussion to discussion regardless of sub-topic.

Furthermore, there are a lot of other confounding variables at play—for example, particularly given the lack of active moderation and guidance (as others have mentioned), the discussion naturally bounced between the different primary topics as well as other unrelated ones within one topic’s thread.

And conversely, a lot of simultaneous discussion about or directly related to those same topics happened in other threads; for example:

Given all of that, it’s hard to see how much useful, reliable “signal” that actually measures the relative community interest and support for each of the topics can actually be gained from this approach, at least as I’ve interpreted the above (perhaps incorrectly).

If that was the primary “deliverable” here, rather than having the discussions themselves mean something and directly motivate funding and action in the directions participants agreed upon, is there a reason the present approach was taken over a single thread with a Discourse poll, which would seem like a far quicker, less effort-intensive and less unreliable (though of course still far from perfect) way to collect such “signal”?

Is the following a correct translation of the envisioned process?

Overall schematic
       User Survey
          / | \
         V  V  V
  Thread 1 ... Thread 5   <--- WE ARE HERE
          \ | /
            V     
High level strategy document 
       /    |    \
      V     V     V
  Stream 1 ... Stream 5
      |     |     |  
      V     V     V
    WG 1   ...   WG 5
      |     |     |  
      V     V     V
 Objectives & deliverables
      |     |     |  
      V     V     V
  Targeted user surveys
      |     |     |  
      V     V     V
     Options analysis
      |     |     |  
      V     V     V
     Choose solution
      |     |     |  
      V     V     V
     Build consensus
      |     |     |  
      V     V     V
 Define technical roadmap    
      |     |     |  
      V     V     V
    Develop solutions
      |     |     |
      V     V     V
    Deliver solutions

To be frank, this whole process seems quite drawn out, bureaucratic and complicated, not to mention volunteer-intensive, a resource which is in critically short supply. It also doesn’t sound close to any of the approaches actually proposed on the strategy thread (competing PEPs with packaging council who decides, evolve pip to meet these requirements, consider new governance structures, get everyone in a room for an in-person summit, etc). Furthermore, it would seem to be at odds with the fundamental established principles of how decisions are made and implemented in the Python and packaging community:

  • Discussions are open to all interested, not just members of defined “workgroups”
  • Ideas are freely proposed, discussed and iterated until consensus is reached, rather than proceeding through a rigid “waterfall” process
  • Solutions are chosen as the result of consensus, not chosen first and then consensus built around them
  • Anyone can propose an idea, but it is the proposer’s responsibility to help provide a reference implementation
  • Proposals are typically accompanied by a working prototype, rather than actual implementation only coming at the very end

Furthermore, who’s the “we” here? Who’s going to be deciding which to pursue in depth? Who’s going to be part of the workgroups? Who’s going to write the roadmaps? Who’s going to develop the solutions?

Assuming the proposed solutions involve creating something other than a brand new tool/website (which would likely end up in an xkcd 927 situation), if the maintainers and community of the affected tool(s) aren’t involved and supportive throughout this process, any proposed solution will (after going through all those steps) be dead on arrival and you’ll have to go back to square one.

Certainly the current process is far from perfect, and it’s well within scope to propose major changes (as we explored in the first strategy thread). However, a full departure from the underlying fundamental principles is likely to not provide a lot of buy-in from the community, without which any resulting strategy unfortunately has little hope of succeeding.

Stepping back a bit, I want to emphasize a broader point. Maybe I missed it somewhere, but it would have been very helpful to have clearly communicated all of this beforehand, before the community invested hundreds of posts and surely at least a comparable number of total volunteer person-hours following and participating in these discussions.

It would of course have better informed people’s responses, particularly those discussing and actively planning community-driven next steps (e.g. competing PEPs and a council to vote on them, changes to pip’s scope, PyPUG improvement, PyPA governance/process adjustments, etc.).

And more importantly, it seems to be a central assumption of those participating that they were the workgroups discussing and proposing the actual solutions. If people were actually made aware up front that the primary goal of the threads was merely “gauge[ing] how strong the signal is for each prompt” [1] rather than their ideas and proposals themselves genuinely playing a direct and meaningful role in the final practical outcome of all of this, I suspect many would have made different choices as to whether to spend their valuable volunteer time and energy participating.


  1. which would then be used as input to a high-level strategy document, which would then inform the creation of workgroups, which would then perform user surveys and options analysis, which would then inform the decision on a solution, around which they would then build consensus, which would then motivate a technical roadmap, which would then motivate the development of solutions, which only then would produce actual results ↩︎

Sorry for the triple post, but I just wanted to shout out the following for those interested in something like this:

We’ve (@pradyunsg, @FFY00 and I) finally been able to formally open signups and topic proposals for the Python Packaging Summit at PyCon 2023! We’ve currently scheduled it in two separate sessions at the beginning and end of the event to give people some time to think in between, and if there’s enough interest, we could potentially reserve some open space between those two or meet up during the sprints for additional informal discussions. Hoping to see you all there!

FWIW, I believe something like this was one of the original options we were presented with.

Everyone voted for Discourse threads, which is why we’ve been doing Discourse threads.

Just to be clear, I wasn’t criticizing the idea of Discourse threads for this (I voted for it myself), just offering another option we’re providing for those interested.

FWIW, looking back at the post in question, 53% of people voted for Discourse at the time (or if I don’t include my own vote, an even 50%) while 44% voted for virtual meetings, and 3% voted “do not want to participate” (in-person meetings weren’t an option). I wouldn’t call that “everyone” (and opinions may have changed over time), and it certainly doesn’t mean that there isn’t value in also having in-person discussions.

As someone who won’t be at PyCon US, I hope that the discussions are recorded and published somehow. In the past, we haven’t been good at doing this. I know it’s really hard to record the sort of interactive discussion the summit is intended for, but IMO it’s crucial, with the wide range of people involved in packaging discussions[1], that we don’t end up with people able to attend PyCon being in a privileged position to influence things.


  1. Particularly as one of the good things with the strategy discussions is how they’ve got new people, with different viewpoints, involved. ↩︎

I won’t be there either, so I’m sure everyone will enjoy not having us clog up all the threads :slight_smile:

Thanks for looking it up. I knew after I posted that I should’ve done it myself, but am already struggling to keep up with my tasks today :smiley: Clearly I remembered an earlier state of the results, because that’s much closer than I thought - maybe it’s close enough to justify setting up a virtual meeting?
