Adopting/recommending a TOML parser?

I think there are still real advantages to having some batteries included, and a file format which we’re using in standards for packaging seems like an excellent candidate. Installable packages play a much bigger role now, but I hope the core developers don’t decide nothing new can be added to the standard library - that would be throwing the baby out with the bathwater.

5 Likes

I didn’t mean that there should be no batteries at all. I’m wary of the notion that “best of breed” libraries should be adopted into the standard library, though.

The bootstrapping requirement for package installation tools (as opposed to build tools & other development tools) is a significant concern, and should guide us in deciding what goes into a minimal installation.

I do believe we should keep in mind that bootstrapping a tool like pip isn’t the same as making all the batteries it depends on part of the standard library, though. A pip that gets packaged with Python does not have to share the pieces that it would otherwise vendor.

Of course, I may have some unusual views on the separation of the interpreter and applications built using the interpreter.

-Fred

1 Like

The TOML specification has now reached version 1.0!

I hope it will be added to the standard library of Python 3.10.

3 Likes

Need to have a (probably massive) discussion about the future of the stdlib, then I will work on getting TOML into the stdlib somehow if it still makes sense.

2 Likes

The second part of the sentence seems to spill the beans about the outcome of the massive discussion :wink:

1 Like

:laughing: I’m actually not going into that discussion with any preconceived notions. I should rephrase it to be more like, “if it still makes sense to pull in a TOML module, I will work on that.”

1 Like

Just to mention rtoml (implemented in Rust), which had a v0.6.1 release 14 days ago.

A Rust implementation might be hard to accept: it would be a burden on alternative implementations, and there’s currently no Rust in CPython.

1 Like

From somebody who has supported the Rustification of a popular Python package: you don’t want to go through that amount of hate and abuse.

8 Likes

@tiran FWIW PEP-656 is in, so this should not be that big of a deal in the future :wink:

Not really. :frowning: The musl libc tag will only appease a small set of users.

1 Like

And it especially won’t help CPython, since it will be those who compile everything from source who’d have an issue with Rust…

2 Likes

Can someone explain what musl has to do with Rust?

1 Like

Only that packages built using Rust can now ship wheels for musl-based platforms, so people installing via pip on those platforms can get wheels rather than needing Rust to build from source.

For the question of recommending a TOML parser, it’s mildly relevant as we want one that has a low barrier to entry - but honestly, it would have to be insanely good for it to be preferable to a pure Python library anyway.

For something that might end up in the stdlib, it’s completely irrelevant as we’re not likely to require Rust to build parts of the stdlib any time soon. Pure Python (or just maybe C) is the only real option.

3 Likes

It appears that the toml package on PyPI (uiri/toml on GitHub) has become the de facto standard for tools that just read TOML files (including pip, pytest, black, towncrier, flit…). tomlkit specialises in programmatically modifying TOML files without destroying formatting and comments, for tools like poetry.

Some of our projects have shifted from pytoml to toml because the former was unmaintained. However, toml also seems to be borderline unmaintained - the last commit was in November 2020, there’s no changelog or release notes, and issues and pull requests aren’t getting responses.

I suspect XKCD 2347 roughly illustrates what we’ve done to William (uiri). I don’t know him, but in all likelihood he put together a TOML parser in Python as an interesting hobby project. Now it’s become the go-to library, we’ve standardised pyproject.toml, and his package is being downloaded a million times a day as a dependency of half the Python ecosystem.

Before long we will probably need to adopt or fork one of these TOML parsing packages. It’s not urgent - we’re not getting a flood of bug reports related to TOML parsing - but it seems clear that what we’ve got is not really sustainable.

6 Likes

I’d personally vote for tomlkit over toml, just because it is better maintained and can write TOML as well as read it. In my book, it offers better features and is more stable.

1 Like

I’m reluctant to depend on tomlkit precisely because it does so much. Perhaps that’s silly, but I wonder if we should go to the other extreme, and make a parsing-only library with no functionality to write TOML at all.

5 Likes

I’ll note that tomlkit is v1.0.0 compliant (technically RC1, but the differences seem to be entirely clarifications in the spec). The toml library is only v0.5.0 compliant. Do we want to take on the job of updating it? Or do we want to end up recommending a library for an out of date version of the spec for the stdlib?

From the original post:

Regardless, if we’re thinking of taking any TOML library under the PyPA banner and recommending it for the stdlib, we should make sure that the author is OK with that before proceeding…

In practice, adding TOML to the stdlib would definitely be the subject of a PEP, and while the proposal might be based on an existing library, it’s quite likely that the exact API could change as a result of the PEP process.

1 Like

@pradyunsg and I exchanged emails with William once, but we have not heard from him since.

It hasn’t gotten a code-related commit since July and I have a PR from September for adding a py.typed file that’s still open, so I don’t know where its maintenance stands.

It’s definitely an idea since how to format a configuration file can be very touchy.

Do note that I am poking the bear next week at the language summit when it comes to the stdlib, so things might be a bit turbulent for changing the stdlib for the short-/medium-term.

I will also say one of the maintainers of TOML itself is here, so I’m sure we can figure something out. :wink:

And lastly, TOML has an ABNF grammar file, meaning we could probably very easily generate a parser, at which point it’s just constructing the objects (especially if we don’t worry about writing out a TOML file).
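To make the "parse, then construct the objects" idea concrete, here is a rough sketch of what a read-only parser could look like. It is a toy, not a proposal: the regexes are hand-written stand-ins for a couple of grammar rules rather than anything generated from the ABNF file, the `loads` and `parse_value` names are made up for illustration, and it only handles `[table]` headers, bare keys, basic strings without escapes, booleans and integers. Inline comments, arrays, dates, floats and everything else in the spec are out of scope.

```python
import re

# Hand-written regexes standing in for two grammar rules (a table header
# and a key/value pair). A real parser would be generated from, or checked
# against, the full ABNF file.
TABLE_RE = re.compile(r"^\[([A-Za-z0-9_.-]+)\]$")
KEYVAL_RE = re.compile(r"^([A-Za-z0-9_-]+)\s*=\s*(.+)$")


def parse_value(text):
    """Build a Python object from a basic string, boolean, or integer."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]  # no escape handling in this sketch
    if text in ("true", "false"):
        return text == "true"
    return int(text)  # raises ValueError on anything unsupported


def loads(source):
    """Parse a tiny TOML subset into nested dicts; read-only by design."""
    root = {}
    current = root
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        table = TABLE_RE.match(line)
        if table:
            # Walk (and create) nested dicts for a dotted table name.
            current = root
            for part in table.group(1).split("."):
                current = current.setdefault(part, {})
            continue
        keyval = KEYVAL_RE.match(line)
        if keyval is None:
            raise ValueError(f"unsupported line: {raw!r}")
        current[keyval.group(1)] = parse_value(keyval.group(2).strip())
    return root
```

With this sketch, `loads('[tool.demo]\nstrict = true')` returns `{'tool': {'demo': {'strict': True}}}`. There is deliberately no `dumps`: once writing is off the table, the whole API is one function that returns plain dicts.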

7 Likes

We don’t need to do formatting. The whole point here would be to keep the existing format. Anyway, for me personally, if it doesn’t handle writing I don’t care about it, so pick whatever. I’ll just need to pull in a third-party library either way to give a good user experience.

1 Like