Maintaining the chunk module after it has been removed from the standard library

I respectfully submit that the chunk module has not outlived its usefulness and continues to play a crucial role in modern Python programming.

First and foremost, the interchange file format remains widely utilized in 2023, and the chunk module facilitates reading various contemporary file formats. This includes essential formats like MIDI and WAV files, as well as several video formats and even proprietary file formats. The reliability of this module is evident in its continued usage, particularly in data logging applications.

Given that the interchange file format has a history spanning at least four decades, it is unlikely to undergo significant changes anytime soon, if ever. In my experience, it has remained remarkably consistent throughout the years.

Furthermore, the chunk module is characterized by its simplicity, which means that it primarily requires updates to remain compliant with new language changes, such as those introduced during the transition from Python 2 to Python 3.
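To illustrate how little is involved, here is a minimal sketch of an IFF-style chunk reader. It is not the stdlib implementation: the function name `iter_chunks` and its `big_endian` and `align` parameters are my own choices for illustration, and I am assuming the usual layout of a 4-byte ID, a 4-byte size, the data, and a pad byte after odd-sized chunks.

```python
import io
import struct


def iter_chunks(stream, big_endian=True, align=True):
    """Yield (chunk_id, data) pairs from an IFF-style byte stream.

    Hypothetical helper for illustration only, not the stdlib chunk API.
    """
    size_format = ">I" if big_endian else "<I"
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream (or a truncated header)
        chunk_id = header[:4]
        (size,) = struct.unpack(size_format, header[4:])
        data = stream.read(size)
        if len(data) < size:
            raise EOFError(f"chunk {chunk_id!r} is truncated")
        if align and size % 2:
            stream.read(1)  # skip the pad byte that keeps chunks 2-byte aligned
        yield chunk_id, data


# Two hand-built chunks: b"NAME" holds 3 bytes (so it is followed by one pad
# byte), b"DATA" holds 4 bytes.
sample = (
    b"NAME" + struct.pack(">I", 3) + b"abc" + b"\x00"
    + b"DATA" + struct.pack(">I", 4) + b"\x01\x02\x03\x04"
)
chunks = list(iter_chunks(io.BytesIO(sample)))
print(chunks)
```

Passing `big_endian=False` covers formats that store the size little-endian, as RIFF-based files such as WAV do. Nested containers and error recovery are left out of this sketch.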

I acknowledge that utility alone might not be sufficient to justify the inclusion of a module in the Python standard library. However, the interchange file format has served as a foundation for adapting to numerous other formats and has found practical applications in diverse fields. This versatility and practicality make a compelling case for the continued inclusion of the chunk module in Python’s standard library.

In conclusion, I believe that retaining the chunk module in Python is not only warranted but also beneficial for the broader Python community due to its enduring relevance and valuable functionality in handling various file formats.

5 Likes

If you are a user of the chunk module, just republish it on PyPI and maintain it. It’s just 170 lines of code and 140 lines of documentation. It has no tests, so the maintenance is even simpler :joy:

7 Likes

I think both the aifc and the chunk module deserve a reevaluation under PEP 594. While the AIFF/AIFC audio file formats are old, they are stable and still widely used as common audio interchange formats.

PEP 594 already includes a note related to this for the aifc module. The same applies to the chunk module. Maintenance overhead is very low on both modules, and while you can always make the argument “simply put this up on PyPI”, the “batteries included” argument is a rather strong one in companies that restrict use of PyPI packages due to security concerns.

BTW: I think we ought to start having community maintainers for stdlib modules and packages that currently don’t have core devs associated with them. Maintenance for stable modules does not require a lot of work and can easily be had via PRs from knowledgeable people from the community. Requiring the core dev status for this is setting the bar way too high.

11 Likes

Commit rights are commit rights: we don’t have the infrastructure to only allow committers to commit changes to certain parts of the codebase.

If we trust a contributor enough in their area, it’s easy enough to merge the fixes. I’m pretty sure the issue is lack of contributors, not lack of a commit bit.

2 Likes

That’s what I understood @malemburg to be saying - let’s formalise the idea that a non-committer can be the “community maintainer” for a module, trusted to create PRs, review them and make decisions about the module. Without the commit bit, they would still rely on a core dev giving their work a final review, but as you say that can be straightforward.

Establishing that level of trust is on a contributor - it’s not enough just to say “I’d like to do it” - but it’s easier to demonstrate trustworthiness if it’s in one specific area.

The only difference from what we do already is a formal acknowledgement. But some people are motivated by being acknowledged like this, and others are put off from contributing by a feeling of “it’s not up to me to make a decision”. Both types of contributor would benefit from knowing that this is a recognised practice.

3 Likes

Sure, we have a pathlib maintainer now because we went ahead and just did that, but it really helped that someone put their hand up and then stuck with it for a year.

I’m not sure what kind of formal acknowledgement you have in mind, but I’m always hesitant to reward before the contributions. If we don’t know the person yet, at most we can say that they’re about to start contributing. Once their contributions are shown to be good and trustworthy, why not acknowledge with commit rights?

The middle ground just seems like motivation for people who want clout but don’t inherently care about the contributions.

2 Likes

Right, this kind of formal acknowledgement is what I was thinking of. Sorry if I wasn’t clear enough.

Core dev maintenance burden is the number one argument I hear from people who want to strip down the stdlib, and I very much believe that we can leverage the ever-growing Python community to help with this by formalizing a process for non-committers to take over large parts of this maintenance, similar to what we do for “triagers”.

Committers would then just do a final review of the PRs and merge them.

This would not only help with modules listed in PEP 594, but also with other modules that don’t have maintainers or don’t have enough maintainers.

1 Like

The core dev process is still way too heavyweight and usually takes months to years of active participation to result in a commit bit being set.

So it sounds like the proposal is for some kind of “trusted reviewer” mark, where once we trust that a particular contributor is able to evaluate the tradeoffs for a particular module, and can apply the subjective checks that we normally apply (PEP 7, docs/NEWS quality, etc.), committers should be encouraged to treat a review from that person as a signal to merge without further review?

Not “trusted reviewers”, I’m suggesting that we have trusted “community maintainers”: people who put their hands up to maintaining a stdlib module, write quality PRs where needed and submit them for final check off by core devs.

This would work great for modules which are stable and don’t need much maintenance.

1 Like

Like I said, we’ve already done that in the past. Nothing is stopping us from doing it again in the future, other than a lack of people putting their hands up.

3 Likes

Yep, and the devguide’s experts index shows the “maintainer or an expert in the field” for each item, and we’ve had people there who aren’t core devs. For example, Pradyun was listed for pip/ensurepip in 2019, before being made a core dev this year.

3 Likes

I’ll do that, thanks. For some reason, I thought there was a licensing conflict.

Looks like this discussion is no longer about PEP 594? Can the moderator maybe split off a thread?

2 Likes

Btw there was some discussion in other threads on the best way to extract modules from the standard library into a different repo, e.g.

Thanks Damian, that’s very useful. I’ll update when I have something worth saying. I appreciate the guidance you and Victor have given. If I get ambitious, maybe I’ll even write some tests! Baby steps though.
Thank you both, and thanks to the Python community.

2 Likes

[The topic split is a bit odd: the discussion which should have been split off is the one about the idea of having “community maintainers”, but now we have a new topic, which still combines the chunk module discussion with the more general discussion around how to get more maintainers for stdlib modules. The chunk module discussion was indeed on topic for the PEP 594 topic.

Anyway, continuing here :slight_smile: ]

Ok, so in terms of process we already have a precedent where we have assigned community maintainers.

What remains to make this more popular and raise community awareness is documenting this title, how to apply as a community maintainer and then some blog post explaining all this either by the PSF or the SC.

Who would be up for working on this (besides myself)?

In my experience, the bar is too high to promote someone as a core dev. It takes months to build a trust relationship. While PEP 13 – Python Language Governance has a broader scope for core devs than “fix bugs in Python or C code”, in practice, it’s still too complicated.

Nowadays, it’s common and easy to install modules with pip. And it’s way easier to maintain anything outside Python, which has heavy constraints: strict coding style, strict portability concerns, specific workflow, slow releases (once a year), etc.

For me, it’s no longer an advantage to be part of the stdlib, but a sign that a module is not going to evolve much anymore. The stdlib is a place where old elephants retire to die :joy: (Goodbye distutils, you served us well!) Well, the os module is actively maintained, and asyncio evolves quickly and frequently. IMO the os module belongs to Python; it would be way less convenient if Python did not provide it.

I see a brighter future for a curated list of PyPI modules, and distributions of such a list for “offline” usage (for people who cannot access the Internet, for example for technical or security reasons), than for any attempt to adjust the Python core dev promotion process.

1 Like

IMO, the stdlib is still important. Depending on non-stdlib modules brings a lot of baggage around virtual environment management, distribution of scripts, etc, that the packaging community hasn’t completely solved yet. Ideally yes, if using an external dependency was seamless, I could see this as a reasonable approach. But it simply isn’t the case for many users at the moment.

So while it might be a long term goal, I think “slimming down the stdlib” is still doing users a disservice in the short term. And therefore I think that there’s still an important place for discussions on how we can support modules within the stdlib better. Community maintainers would be a good idea in this context.

The distutils situation is a good example. From the core devs’ point of view it was a great move, getting rid of a package that was impossible to maintain, full of issues, and required specialist knowledge. But for the wider community, the removal has caused a lot of issues, as external projects have had to adapt to the change. I’m not saying we shouldn’t have done it, but I do think the wider community impact could have been managed better.

(By the way, I agree with @malemburg - the split is a bit odd, as I wouldn’t have thought the “community maintainers” discussion is really appropriate for the “Python Help” category. I’m only seeing this, for example, because I was involved before the split - otherwise, I have the help category muted).

3 Likes

Are you volunteering to curate such a list? The idea keeps getting tossed around, but nobody steps up to actually maintain one :slight_smile: