An extension for argparse - class-based API

Many other programming languages have argument parsing libraries that work by simply writing a struct with some additional annotations; the parser then just returns an instance of that struct. The Python standard library has argparse, which is quite flexible and powerful, but it returns a very dynamic Namespace object. Users either have to live with that dynamism and loss of error checking, or manually repeat many argument names and types.
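For comparison, this is roughly what that repetition looks like with plain argparse today (a minimal illustration, not taken from the proposal): the option name, its type, and the attribute used to read it back all have to be kept in sync by hand, and the resulting Namespace is opaque to type checkers.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("first_arg", type=int)
parser.add_argument("--second-arg", type=float, default=None)
args = parser.parse_args()

# args is a plain Namespace: type checkers cannot verify these attributes,
# and the names/types above must be repeated wherever they are used.
print(args.first_arg + 1)
print(args.second_arg)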

Python does have some third party argument parsing libraries like click, but they diverge a lot more from argparse, and tend to also go a step further and directly run functions for you.

I’ve been messing around, and with a relatively small amount of code I’ve been able to write a handful of functions that let you write things like this:

from dataclasses import dataclass
from typing import ClassVar

# positional, option, exclusive_group, subparsers, parse_args and Exclusive are
# the proposed helpers; SubCommand1 and SubCommand2 are other argument classes
# defined elsewhere.

@dataclass
class MyArgs:
    first_arg: int = positional()
    second_arg: float | None = option()

    ex1: ClassVar[Exclusive] = exclusive_group()

    foo: int | None = option(exclusive_group=ex1, help="this is forwarded back to argparse")
    bar: int | None = option(exclusive_group=ex1, metavar="so_is_this")

    sub: SubCommand1 | SubCommand2 | None = subparsers(default=None)

print(parse_args(MyArgs))

The generated --help output looks like this:

usage: scratch.py [-h] [--second-arg SECOND_ARG] [--foo FOO | --bar so_is_this] first_arg {sub1,sub2} ...

positional arguments:
  first_arg
  {sub1,sub2}

options:
  -h, --help            show this help message and exit
  --second-arg SECOND_ARG
  --foo FOO             this is forwarded back to argparse
  --bar so_is_this

Almost everything is handled transparently by forwarding back to argparse. It simply removes some of the boilerplate involved in repeating variable names and typing information. Since it’s not much code and nearly everything is forwarded straight back to argparse (it has almost no “behavior” of its own - just a few small but significant conveniences), I thought this might be suitable to add to the argparse module.
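To make “forwarding” concrete, here is one plausible sketch of how a helper like option() could work - purely illustrative, not the actual implementation: the field helper just captures its keyword arguments, and a parse_args(cls) function later hands them to add_argument, so every argparse keyword keeps its usual meaning.

from dataclasses import field

def option(**argparse_kwargs):
    # Hypothetical sketch: stash the argparse keyword arguments in the
    # dataclass field metadata. A parse_args(cls) helper would later walk
    # the fields of cls and call parser.add_argument("--<field-name>",
    # **argparse_kwargs), taking the type from the field's annotation.
    return field(default=None, metadata={"argparse_kwargs": argparse_kwargs})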

The closest thing I found was mivade/argparse_dataclass on GitHub (“Declarative CLIs with argparse and dataclasses”), but it has quite a bit more boilerplate, and I believe it also does not handle things like subparsers and mutual exclusion.

I think if something like this were added to the standard library, most people who use type checkers would simply use it instead of the current argparse API.

There are a lot of alternate takes on command line argument parsing on PyPI already, many of them popular. If none of them meets your needs, adding another to PyPI or improving one of the existing ones makes sense.

We already have too many command line parsers in the stdlib (and perpetual threads about that…). The standard library is not the place to innovate in this space.

I think that’s kind of the point. I am not trying to innovate - most command line parsers I’ve seen depart far more radically from argparse (and there are plenty of reasons for doing so; I’m not criticizing that). I’m just wrapping the API in the most obvious improvements based on typing and dataclasses. It’s not opinionated at all, and anyone who understands argparse will fully understand how to use this (including the advanced functionality) in two minutes.

You wrote that we have too many command line parsers, but this is not a new command line parser - it’s just an alternate API to argparse. I don’t think any of this is very similar to what most third party command line argument parsers are trying to do.

And I think the reality is that plenty of people still use argparse simply because it’s there in the standard library, and they’ve used it before. This is about providing an easy, moderate improvement to that “default” choice of argument parser - not innovating or coming up with a new paradigm for argument parsing.

I like the combination with dataclasses, especially if it simplifies subparsers and commands. I’m interested in using this if you put it on PyPI. I find existing third party frameworks a bit too auto-magical, decorator-heavy, and bloated.

That IS innovation, though 🙂 Most of the tools built on argparse do fairly straightforward things, but they’re still beneficial.

What you should be able to do is make a package that you publish on PyPI that does the augmentation. In fact, if you wanted to, you could even make it completely backward compatible, and then recommend that people use import argparse_classbased as argparse to use it.
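A hypothetical sketch of what that drop-in usage could look like, assuming such a package re-exported everything from argparse alongside the new helpers (the package name comes from the suggestion above and does not actually exist):

import argparse_classbased as argparse  # hypothetical package name

# Existing argparse code keeps working because everything is re-exported...
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store_true")

# ...while the new class-based helpers (option, positional, parse_args, ...)
# would live in the same namespace for code that opts in.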

As an aside, I’d recommend taking a look at Cappa, which follows this declarative approach and does that extremely well.

Your proposed extension looks beautiful to me.

I agree that it isn’t a new parser; it is just a more convenient front-end, much like the various ways to configure logging: dictConfig, fileConfig, etc.
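To make the logging analogy concrete: dictConfig does not replace the logging machinery, it is simply a declarative front-end for configuring it, just as the proposed API would be a declarative front-end for argparse.

import logging.config

# dictConfig is an alternate front-end over the same logging machinery that
# basicConfig or manual handler setup would otherwise drive.
logging.config.dictConfig({
    "version": 1,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "root": {"level": "INFO", "handlers": ["console"]},
})

logging.getLogger(__name__).info("configured via dictConfig")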

Thanks! Those are great examples - I was trying to find comparable examples in the standard library and not succeeding.

I don’t really want to nitpick over the definition of innovation. Almost none of the tools being discussed here for command line argument parsing share the goals of this proposal: total compatibility with argparse, an API where nearly all keyword arguments keep the same meaning, exposing the argparse API directly alongside whatever other API is offered, and so on. So it’s not really relevant to bring them in - that is what I was trying to get at with my comment about innovation. Hopefully that’s clearer.

Thanks for the encouragement! I will definitely be looking to put a 0.1 release on PyPI soon. If nothing else, I think it’s necessary for people to understand the scope and API of what’s being proposed (and in particular, just how small it is - we’re talking about a few hundred lines of Python that could be added to argparse to significantly improve usage).

Thanks, I checked it out. I notice that Cappa uses the Annotated approach to carry the metadata for each parsed field, as opposed to the “field-like” approach I use (for lack of a better term). I considered both approaches and couldn’t find much of a technical reason to prefer either; I ended up with the approach shown above because it keeps the type annotation clearer. Do you have a preference or any thoughts on the two approaches?

Agreed that debating what constitutes “innovation” is probably not helpful.

But from a purely practical point of view, if what you want is to make your API available for others to use, the best route is almost certainly to publish it as a library on PyPI. That can happen right now, and users on older versions of Python can benefit. Trying to get it added to the stdlib will take time, won’t help users of Python 3.14 and earlier, and is actually pretty unlikely to succeed (innovation or not, there’s little or no appetite among the core devs for adding yet more APIs for argument parsing).

Of course, if you simply don’t want to put the effort into publishing and maintaining your own package, that’s perfectly fine.

I will probably start with a PyPI package - at a practical level there needs to be some implementation experience, and even to propose this I need to be able to quickly give a concrete sense of the API. One person I spoke to, for example, told me they were initially mildly against the idea, but became mildly in favor when they realized the entire API being added would probably amount to ~5 functions/classes, consisting of more type stubs than implementation.
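To give a rough (and entirely hypothetical) sense of that surface area, the names used in the example at the top of the thread suggest stubs along these lines - the signatures here are guesses, not the actual proposal:

from typing import TypeVar

T = TypeVar("T")

class Exclusive: ...                                         # handle naming a mutually exclusive group

def positional(**argparse_kwargs): ...                       # field helper for a positional argument
def option(*, exclusive_group=None, **argparse_kwargs): ...  # field helper for an --option
def exclusive_group() -> Exclusive: ...                      # declare a mutually exclusive group
def subparsers(*, default=None): ...                         # field helper for a union of sub-command classes
def parse_args(cls: type[T], args=None) -> T: ...            # build a parser from cls and parse argv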

I do still hope to get it into the stdlib someday, because the reality is that a lot of people just use what’s in the standard library if it’s vaguely reasonable. That’s certainly what I did with argparse, and in my experience it’s what most people do - most people don’t have strong feelings about argument parsing and (frankly) don’t really care about the opinionated back-and-forth around a lot of these third party libraries. But a lot of folks would like better type checking and a little less boilerplate, without having to commit to a third party library, vet its quality, and so on.

The approach Cappa takes is to “stay out of your way”: it doesn’t modify your dataclasses in any way that would prevent you from using them as such. You could replace the @cappa.command decorator and the Cappa annotations with dummy ones and your dataclasses would still work perfectly. That’s well illustrated by your annotations-vs-custom-field-values question, since not using custom field values lets you assign the actual default value of a field:

from typing import Annotated
import cappa

class Command:
    option: Annotated[int, cappa.Arg(...)] = 0
    # as opposed to `option: int = custom_field(default=0)` or similar
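A quick illustration of the “stays out of your way” point, as a sketch assuming dummy metadata in place of the Cappa bits: the class remains an ordinary dataclass you can construct and use directly.

from dataclasses import dataclass
from typing import Annotated

@dataclass
class Command:
    # Annotated metadata is invisible to the dataclass machinery, so swapping
    # cappa.Arg(...) for any placeholder (or removing it) changes nothing here.
    option: Annotated[int, "cli metadata placeholder"] = 0

cmd = Command(option=3)   # plain dataclass construction still works
print(cmd)                # Command(option=3)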