Getopt and optparse vs argparse

storchaka · October 29, 2024, 10:39pm

Last month I worked on the argparse module (not only on it). At the beginning, there were about 185 open issues, almost half of them bug reports. Now the total number of issues has been reduced by half, and the number of bugs by a factor of four. This is in addition to the issues I solved at the beginning of the year. Many remaining issues are difficult to resolve without changing the foundation of the module. argparse has many features, but they do not always work together. The behavior of a CLI built with argparse differs from the behavior of most POSIX and GNU tools, to the degree that these peculiarities can be qualified as bugs. And this is deep in the design of argparse. When you build a simple interface with argparse, the behavior can be unexpected in corner cases, with no way to configure it. And when you want to implement a more complex interface, you have a better chance with getopt or optparse. In many cases the answer is: this is impossible with argparse, use getopt or optparse.

But getopt and optparse are “soft-deprecated” now, even though they are solid and mature and do not have nearly as many bugs.

I suggest undeprecating getopt and optparse. getopt should never have been deprecated. It is very simple, but it is a standard. It allows implementing the behavior of standard POSIX and GNU tools. Programmers who come from other programming languages are usually familiar with it. optparse has a different, object-oriented API, but the resulting CLI is still predictable and consistent with other programs. It should be the recommended module until we solve all fundamental argparse issues.
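
For readers who have not used it, here is a minimal sketch of the two parsing styles that getopt offers. The option letters and the file name are made up for illustration. `getopt.getopt` stops scanning at the first non-option argument (POSIX style), while `getopt.gnu_getopt` allows options and operands to be mixed (GNU style, unless the `POSIXLY_CORRECT` environment variable is set).

```python
import getopt

# Hypothetical command line: an option, an operand, then another option.
argv = ["-a", "input.txt", "-b"]

# POSIX style: scanning stops at the first non-option argument,
# so "-b" is left among the remaining arguments.
posix_opts, posix_rest = getopt.getopt(argv, "ab")

# GNU style: options may appear after operands.
gnu_opts, gnu_rest = getopt.gnu_getopt(argv, "ab")

print("POSIX:", posix_opts, posix_rest)
print("GNU:  ", gnu_opts, gnu_rest)
```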

I am going to implement special modes for argparse that would make its behavior more standard, but many features will not work in these modes, or will work differently. It may take years before we can change its default behavior and solve the new issues caused by this. I would rather assign a “provisional” status to argparse than deprecate getopt and optparse.

tjreedy · October 30, 2024, 5:49am

IDLE currently uses getopt. Ignoring the deprecated optparse, I considered using argparse to get more readable and testable code. But the open issues have discouraged me a bit. I just read enough of the optparse doc and discovered that it seems adequate for IDLE and that argparse seems like a superset, with extended features that I mostly (all?) do not need. So if optparse is undeprecated, I would likely give it a try.

ncoghlan · October 30, 2024, 8:12am

The click project also has a couple of stated reasons for why it is based on optparse internally rather than argparse: Click - Why not argparse?

They both relate to the fact that argparse needs to parse the entire command line in one go, otherwise its behaviour isn’t well-defined. Because optparse is more explicit about what is going on, you can feed it fragments of a command line and it will still process them correctly.
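
As a rough illustration of feeding optparse a fragment of a command line (this is a sketch, not how click is actually implemented), a top-level parser can stop at the first positional argument and hand the remainder to a second parser. The `tool` program, its `build` subcommand, and all option names below are hypothetical.

```python
import optparse

# Top-level parser: stop at the first positional argument so that
# everything after it is returned untouched.
top = optparse.OptionParser(prog="tool")
top.add_option("-v", "--verbose", action="store_true", default=False)
top.disable_interspersed_args()

# Parser for a hypothetical "build" subcommand.
build = optparse.OptionParser(prog="tool build")
build.add_option("-j", "--jobs", type="int", default=1)

argv = ["-v", "build", "--jobs", "4", "target"]

# First pass consumes only the leading global options.
top_opts, rest = top.parse_args(argv)

# Second pass parses just the fragment that follows the subcommand name.
command, fragment = rest[0], rest[1:]
build_opts, build_args = build.parse_args(fragment)

print(top_opts.verbose, command, build_opts.jobs, build_args)
```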

If I was restricted to writing a CLI with just the standard library (that is, I couldn’t reach for click or Typer instead), I’d probably still reach for argparse over optparse, but I agree “deprecated” isn’t the right way to describe the status of getopt and optparse.

For getopt, I think the note at the top of getopt — C-style parser for command line options — Python 3.13.0 documentation is sufficient on its own, and the soft deprecation notice could just be dropped without any further changes.

For optparse, the soft deprecation notice at the top of optparse — Parser for command line options — Python 3.13.0 documentation would need to be replaced with a new note, especially if we retain the optparse feature freeze, and keep the module focused on serving as a foundation for third party argument parsing libraries, rather than advocating for people to use it directly.

(Personally, I’d also like to see “See also” notes for click and Typer in the docs for both argparse and optparse, similar to the tomli-w and tomlkit recommendations at the start of tomllib — Parse TOML files — Python 3.13.0 documentation).

malemburg · October 30, 2024, 9:19am

+1 on this.

I never understood why we have to (soft-)deprecate a 215 line Python code module, which doesn’t need much maintenance.

I’ve been using getopt in PyRun to match Python’s command line parsing for many many years, without any problems.

vstinner · October 30, 2024, 10:50am

Do you have examples of such issues?

storchaka · October 30, 2024, 11:48am

Usually, if option -f has a required parameter, -f -b means option -f with argument -b. But in argparse this is an error: a missing argument for option -f. You have to write -f-b (without a space) to pass argument -b to option -f. But this does not work if the value starts with =, because -f=-b means option -f with argument -b. You have to write -f==-b or -f =-b to pass argument =-b to option -f. The latter does not work if the value starts with -, and the former is unexpected, because it differs from the behavior of other Unix/Linux programs.

Also, the workaround does not work at all for multi-argument options.
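
The difference described here can be reproduced with a short script. This is a minimal sketch: the `demo` program name, the `ParseError` class, and the helper functions are made up for illustration, and `error()` is overridden only so that the failure can be captured instead of terminating the process.

```python
import argparse
import getopt
import optparse


class ParseError(Exception):
    """Raised instead of printing usage and exiting (illustrative helper)."""


class RaisingParser(argparse.ArgumentParser):
    def error(self, message):
        raise ParseError(message)


def argparse_f(argv):
    """Return the value argparse assigns to -f, or an error string."""
    parser = RaisingParser(prog="demo")
    parser.add_argument("-f")
    try:
        return parser.parse_args(argv).f
    except ParseError as exc:
        return f"error: {exc}"


def getopt_f(argv):
    """Return the value getopt assigns to -f."""
    opts, _ = getopt.getopt(argv, "f:")
    return dict(opts)["-f"]


def optparse_f(argv):
    """Return the value optparse assigns to -f."""
    parser = optparse.OptionParser(prog="demo")
    parser.add_option("-f")
    values, _ = parser.parse_args(argv)
    return values.f


print("argparse -f -b :", argparse_f(["-f", "-b"]))
print("argparse -f-b  :", argparse_f(["-f-b"]))
print("argparse -f=-b :", argparse_f(["-f=-b"]))
print("getopt   -f -b :", getopt_f(["-f", "-b"]))
print("optparse -f -b :", optparse_f(["-f", "-b"]))
```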

This is at the core of argparse: it assumes that a user omitting the argument of an option is much more probable than a user passing an argument that starts with -. argparse does not work without this assumption.
Usually, if option -f has an optional parameter, -f b means option -f without an argument followed by a positional argument b, but in argparse it means -f with argument b.
In argparse you can interleave options and positional arguments, but not always. For example, in a cp-like program you can run cp file -f target, cp -f file1 file2 target, cp file1 file2 target -f, but not cp file1 file2 -f target. The latter parses the second positional argument as target and reports the last argument as unrecognized.
There are no tests for this, but in argparse you can add positional arguments after subparsers. I am sure that this is used in user code, because there are many reports about cases where it does not work. For example, a variable-argument positional parameter never works. Also, subparsers should consume a variable number of arguments.
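
The optional-parameter and interleaving cases above can be tried with a small script. This is only a sketch: the parsers mimic the shape of the examples in the post, and the `ParseError` class, the helper names, and the `path` positional are assumptions made for illustration. The outcome of the last cp command line is printed rather than asserted, since it is the case reported as failing.

```python
import argparse


class ParseError(Exception):
    """Raised instead of printing usage and exiting (illustrative helper)."""


class RaisingParser(argparse.ArgumentParser):
    def error(self, message):
        raise ParseError(message)


def optional_f(argv):
    """Parse -f with an optional parameter plus one optional positional."""
    parser = RaisingParser(prog="demo")
    parser.add_argument("-f", nargs="?", const="<no value>")
    parser.add_argument("path", nargs="?")
    ns = parser.parse_args(argv)
    return (ns.f, ns.path)


def try_cp(argv):
    """Parse a cp-like command line; return the result or an error string."""
    parser = RaisingParser(prog="cp")
    parser.add_argument("-f", action="store_true")
    parser.add_argument("files", nargs="+")
    parser.add_argument("target")
    try:
        ns = parser.parse_args(argv)
    except ParseError as exc:
        return f"error: {exc}"
    return (ns.files, ns.target, ns.f)


# "-f b": argparse gives "b" to -f instead of treating it as a positional.
print(optional_f(["-f", "b"]))

for argv in (
    ["file", "-f", "target"],
    ["-f", "file1", "file2", "target"],
    ["file1", "file2", "target", "-f"],
    ["file1", "file2", "-f", "target"],  # the case reported as failing
):
    print(" ".join(["cp"] + argv), "->", try_cp(argv))
```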

These are just some examples that certainly stem from the argparse design. You can find more on the tracker. Some issues were reclassified from bug reports to feature requests because, even if the current behavior is meaningless or frankly wrong, it cannot be changed in bugfix releases.

ZeroIntensity · October 30, 2024, 1:20pm

I’d like to hear the historical reason why optparse was deprecated in the first place. Was it just because argparse was the hot new thing?

AA-Turner · October 30, 2024, 1:26pm

PEP 389 § Deprecation of optparse

mwichmann · October 30, 2024, 2:36pm

The combination of allowing multiple option-arguments (nargs > 1), space-separated option-arguments, and interleaving, makes some scenarios very complex (this is something in common between argparse and optparse, not an argparse-specific wart). Maybe that is just too much flexibility? It would have been simpler if some things @storchaka lists as issues just weren’t allowed: e.g. option-arguments beginning with a dash could have been prohibited, and for cases like optional arguments (nargs == '?') and multiple arguments you need to use = to introduce the opt-args and use a comma-separated word for multiples rather than multiple words ( --args foo bar baz prohibited if you meant --args=foo,bar,baz). Well, that’s just noodling, we can’t change those things now in either module.
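
The comma-separated convention suggested here can already be adopted by an individual program on top of argparse. The following is a minimal sketch, assuming a hypothetical `--args` option and a made-up `comma_list` converter; it does not prevent users from writing the space-separated form, it only shows the `=` plus comma style.

```python
import argparse


def comma_list(text):
    """Split a comma-separated word into a list, dropping empty items."""
    return [item for item in text.split(",") if item]


parser = argparse.ArgumentParser(prog="demo")
# A single word carries all values, so the option takes exactly one argument.
parser.add_argument("--args", type=comma_list, default=[])
parser.add_argument("paths", nargs="*")

ns = parser.parse_args(["--args=foo,bar,baz", "a.txt"])
print(ns.args, ns.paths)
```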

barneygale · October 30, 2024, 4:42pm

I know it will take longer and be more difficult, but IMHO it’s better to execute your plan without marking argparse as provisional or un-deprecating optparse. Deprecating and un-deprecating optparse sends a confusing message to users, especially as these bugs have existed the entire time that argparse has been in the standard library. I agree the bugs are bad - and only get more terrifying when you debug them and realise how argparse really works - but they’re edge cases that don’t affect most users, and often have simple workarounds, like re-arranging or quoting arguments. Just my 2c.

(Admittedly I quite like argparse and I’ve been using it since before it was in the standard library, because its interface is better than getopt, and it parses positional arguments unlike optparse.)

Let me know if I can help write some tests or review some changes!

jamestwebber · October 30, 2024, 5:06pm

I agree with this–it would be a shame if the outcome of this thread was two frozen modules and one marked “provisional”.

Nodd · October 30, 2024, 5:36pm

Serhiy Storchaka:

Usually, if option -f has a required parameter, -f -b means option -f with argument -b. But in argparse this is an error: a missing argument for option -f. You have to write -f-b (without a space) to pass argument -b to option -f. But this does not work if the value starts with =, because -f=-b means option -f with argument -b. You have to write -f==-b or -f =-b to pass argument =-b to option -f. The latter does not work if the value starts with -, and the former is unexpected, because it differs from the behavior of other Unix/Linux programs.

Also, the workaround does not work at all for multi-argument options.

This is at the core of argparse: it assumes that a user omitting the argument of an option is much more probable than a user passing an argument that starts with -. argparse does not work without this assumption.

I had this problem multiple times when passing negative numbers as arguments. It’s a pain to explain the workarounds to users.

storchaka · October 30, 2024, 5:42pm

argparse was added 15 years ago, and for all those 15 years it has been pushed as the preferred solution, while getopt and optparse were played down. It may take another 15 years (I hope we will manage in 10) to make it work right. The result will look more like an improved optparse, and it will not be compatible with the current argparse (this is what the 10 years of gradual changes are needed for). Until then, the answer to many problems will be “use optparse or getopt”.

Also, even if argparse is fixed, there is no reason to deprecate getopt. It has its niche.

Let me know if I can help write some tests or review some changes!

I will rely on you.

storchaka · October 30, 2024, 5:50pm

It is so now in argparse. And this is a problem, because other programs allow option-arguments beginning with a dash. You can get an unexpected error and are forced to look for a workaround. Python programs that use argparse look buggy or inferior compared to other programs. This casts a shadow over Python as a whole.

pf_moore · October 30, 2024, 7:16pm

For what it’s worth, pip uses optparse, and is unlikely to change unless/until we adopt click, but there are no immediate plans to do that; it’s just an idea at this point.

BrenBarn · October 30, 2024, 7:45pm

I think there are maybe two dimensions to this. One is that argparse makes some tasks hard, but the other is that the documentation does not foreground the assumptions or limitations of the module.

That actually seems like a fairly reasonable default assumption to me. The problems are, first, that there seems to be no way to turn off that assumption (i.e., some kind of “raw mode” option); and second, that the documentation of this is buried way down inside the description of parse_args.

I wonder how much would be gained by having the argparse documentation state at the beginning something like “This module is not suitable for applications that need to parse arbitrary argument values that may begin with -.” At least that way people wouldn’t get partway in and only later realize that argparse limits what they can do.

Either way, I agree that the “deprecations” for optparse and especially getopt could probably be made even a bit softer. Like maybe instead of just saying “this is for people who know C getopt” it could add “or those who need more flexibility than what argparse provides and are prepared to accept less out-of-the-box functionality as a tradeoff”.

pf_moore · October 30, 2024, 8:06pm

That would only result in people asking what is the module that’s suitable for that use case. (Replicating the CLI of an existing POSIX-style utility is a perfectly legitimate use case.) I’d be OK with this if getopt and optparse were undeprecated.

ncoghlan · October 31, 2024, 12:23am

Undeprecating really does seem like the right option to me. These things happen, and it is better to explain them as clearly as we can than to continue to make inaccurate statements in the documentation. The current deprecation notices are inaccurate:

neither getopt nor optparse are in any danger of removal
getopt is a better choice than argparse if you’re aiming to emulate C getopt functionality
optparse is a genuinely better choice than argparse if you’re concerned about the parsing consistency issues that Serhiy mentioned here, or the composability limitations mentioned in the click docs

When PEP 389 was accepted, argparse was genuinely believed to be the better choice for all potential users. It took time to realise that the trade-offs between the different modules were subtler than that.

If PEP 389 were written today, I doubt we would accept it. However, Python packaging was in a much worse state in 2009 than it is now (pip was only released in 2008, and didn’t reach 1.0 until 2011. setuptools didn’t reach 1.0 until 2013), so the argument in favour of more capable stdlib argument processing was more compelling.

savannahostrowski · October 31, 2024, 1:05am

While I know that you and I have been plugging away at argparse issues for the past month or two, one other angle that hasn’t been discussed here is that there are no current “official” maintainers for argparse. Since the surface area of the optparse and getopt modules is fairly small, we’d probably be in no worse a situation than we are now in terms of maintenance but I just wanted to mention this angle since we are potentially adding more to an “understaffed” area.

csm10495 · October 31, 2024, 3:10am

I think the argument that the currently deprecated modules have fewer bugs is a bit off, since their usage has been downplayed for a while now.

If it wasn’t downplayed I’m sure something (bugs or features) would have come in.

As argparse is the only actively supported/recommended module at the moment, it would naturally have more bugs/issues reported.

I wouldn’t really want us to reverse and deprecate argparse after recommending it for so long… but having 3 ways to ‘parse args’ all supported seems a bit against the one obvious way to do things.