Getopt and optparse vs argparse

GitHub search:

  • import getopt - 100k files
  • import optparse - 124k files
  • import argparse - 3.5M files
5 Likes

Adding to the Python GitHub search results:

  • import click - 327k files

(for how new it is, even Typer manages a respectable 45k)

2 Likes

This is basically what I was going to come here to write. As someone who maintains a significant number of CLIs, I feel very strongly that we should maintain a single argument parser in the standard library and it should be based on actual usage. I’ve been in this community for over a decade and I’ve never seen code in the wild (other than pip), or tutorials, using those other two options.

We should recommend third-party libraries like Click in the documentation as we do for HTTP functionality, and TOML as Alyssa mentioned.

The shadow you refer to pales in comparison to the potential one produced by having three suboptimal options in the standard library. We should recommend third-party libraries and continue our reduction of standard library modules; that’s the solution.

I agree with this completely, both regarding the bugs being edge cases and the general favorability of argparse. It’s easy to use, supports positional arguments, has fantastic documentation, and is familiar to most people because that’s just what everyone uses and writes tutorials about.

Why is there a desire for something niche in the standard library? Seems like a fantastic use case for a third-party library.

I want to be clear that I very much appreciate your work on this and more broadly as a core developer! At the same time, that statement gives me pause. I feel very much like we are in two different worlds. To me that sentence is exactly the same as saying that the re module is insufficient because it doesn’t support POSIX character classes like [[:digit:]], without acknowledging that the vast majority of users are content.

If you feel very strongly that those other two should be brought back then I would say a better option is to deprecate the remaining one. I don’t agree with that approach of course, as I prefer that we simply recommend third-party libraries as we usually do.

7 Likes

@storchaka and I have a couple of alternative PRs up considering potential options for reversing the deprecation of optparse (and bringing the status of getopt back into line with the PEP 389 recommendation to leave it undeprecated and just note that it serves a very specific purpose in emulating C application behaviour). (Both PRs are flagged as DO-NOT-MERGE for now, since the discussion of whether or not the deprecations even should be reversed is still ongoing)

Serhiy’s is a relatively minimal change, mostly just restoring the relevant docs structure that existed in Python 3.12: gh-126180: Undeprecate getopt and optparse by serhiy-storchaka · Pull Request #126186 · python/cpython · GitHub

Mine takes a different perspective, explicitly positioning the 2 libraries as “argparse is for writing command line applications, while optparse is for implementing command line argument processing libraries (and potentially for working around certain argparse limitations if they affect your application)”: gh-126180: Remove getopt and optparse deprecation notices by ncoghlan · Pull Request #126227 · python/cpython · GitHub

For those advocating that the formal deprecation of optparse should stand:

  • if you’re writing your own command line argument processing library, you probably should be using optparse (not argparse), for the same reasons click does so
  • if you’re accepting command line options with parameter values where - is likely to be present as the first character, the way argparse works is probably going to cause you problems, and you’ll likely be better off using either optparse or one of the third party libraries based on it (like click or Typer)
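
A minimal sketch of the second point, using only the two stdlib modules. The `--pattern` option and the `-foo` value are made up for illustration; the comments describe how I understand argparse and optparse to treat a value that starts with `-`, so treat it as a demonstration rather than a specification.

```python
import argparse
import contextlib
import io
import optparse

ARGV = ["--pattern", "-foo"]  # hypothetical option whose value starts with "-"

# optparse: an option that takes a value simply consumes the next argument.
opt_parser = optparse.OptionParser()
opt_parser.add_option("--pattern")
opt_values, _ = opt_parser.parse_args(ARGV)
optparse_result = opt_values.pattern

# argparse: "-foo" looks like another option, so "--pattern" is reported as
# missing its argument and the parser exits via SystemExit.
arg_parser = argparse.ArgumentParser()
arg_parser.add_argument("--pattern")
try:
    with contextlib.redirect_stderr(io.StringIO()):  # hide the usage message
        arg_parser.parse_args(ARGV)
    argparse_rejected = False
except SystemExit:
    argparse_rejected = True

# Spelling the value with "=" is accepted by argparse.
workaround = arg_parser.parse_args(["--pattern=-foo"]).pattern

print(optparse_result, argparse_rejected, workaround)
```

The `--pattern=-foo` spelling sidesteps the problem, but it only helps when you control how callers invoke the tool.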

The situation with getopt is different. While it wouldn’t pass muster as a new standard library addition these days, it also isn’t causing any maintenance hassles, so there’s no reason to disrupt the people that are using it, and if your goal is to emulate the behaviour of what C getopt can do, it’s a better option than even optparse. (I may move it back to the “Superseded” section of the docs in my draft PR, though. The use cases for it really are far more obscure than the use cases for the other two modules).
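
For readers who have not seen it, here is a small sketch of what the C-style getopt interface looks like in Python. The option letters and file names are invented for the example.

```python
import getopt

# Hypothetical command line: two short options (one taking a value),
# one long option, and one positional operand.
argv = ["-v", "-o", "out.txt", "--level=3", "input.txt"]

# "vo:" mirrors a C getopt optstring: -v is a flag, -o requires a value.
# Long options are listed separately; a trailing "=" means "takes a value".
opts, operands = getopt.getopt(argv, "vo:", ["level="])

for name, value in opts:
    print(name, repr(value))
print("operands:", operands)
```

The result is just a list of (option, value) pairs plus the leftover operands, with no help generation or type conversion, which is what makes it a close match for porting C programs.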

2 Likes

This may be partially true, but the main reason is that they have fewer features and simpler internal logic, so it was faster to sweep out bugs. The last real bug report for getopt was in 2008 (fixed in 2010). All following bug reports were invalid – just misunderstanding from the user side. On the other hand, argparse has design flaws. Its bugs are so long-lived because it is not easy to fix them while keeping most of the other features working.

In hindsight, argparse should not have been added to the stdlib. It would have been better to develop the existing optparse than to add a new module with a similar but different API and completely different logic. getopt and optparse do not compete – one provides a low-level API similar to APIs in other programming languages, and the other provides an object-oriented extensible API.

What’s done is done, and I do not want to remove any of the existing modules from the stdlib. This will only disrupt user code. But argparse needs a lot of work. It was a mistake to recommend it. We should not downplay the other modules. Migrating from getopt or optparse to argparse will now cause more headaches than benefits.

No, this is like having an alternative to the re module in the stdlib, which supports POSIX character classes and many other advanced features, but with caveats – all quantifiers are non-greedy, backslash cannot be used in the character set, backtracking works only at one level, and capturing groups do not work in alternations. It may work identically to regular expressions in other programming languages, but it will shoot you in the foot more often than you would like.

7 Likes

Searching the top 8k PyPI projects (the text below wrapped in regex \b) gives similar proportions:

  • import getopt - Found 256 matching lines in 117 projects
  • import optparse - Found 369 matching lines in 190 projects
  • import argparse - Found 5308 matching lines in 1273 projects

Third-party:

  • import click - Found 3018 matching lines in 332 projects
  • import typer - Found 996 matching lines in 38 projects
7 Likes

I would like to ask this question again because I think itā€™s incredibly important:

2 Likes

There have been a few suggestions that the documentation should recommend third-party solutions but I want to point out that outside of simply being included in the stdlib, one of the major advantages argparse has is that it has a significantly lower import time than the alternatives[1].

This is to the point where a simple argparse based tool can import, parse arguments, execute and exit before many of the alternatives have finished their initial imports.

With “pass” as a baseline.

Summary
  python -c "pass" ran
    1.23 ± 0.06 times faster than python -c "import argparse"
    2.03 ± 0.10 times faster than python -c "import datargs"
    2.05 ± 0.10 times faster than python -c "import click"
    2.22 ± 0.11 times faster than python -c "import pyrallis"
    2.40 ± 0.12 times faster than python -c "import tap"
    3.20 ± 0.16 times faster than python -c "import simple_parsing"
    3.89 ± 0.18 times faster than python -c "import tyro"
    5.10 ± 0.24 times faster than python -c "import cappa"
    5.21 ± 0.25 times faster than python -c "import typer"
Full timings
Benchmark 1: python -c "pass"
  Time (mean ± σ):      27.4 ms ±   1.2 ms    [User: 11.7 ms, System: 12.6 ms]
  Range (min … max):    25.5 ms …  30.9 ms    20 runs

Benchmark 2: python -c "import argparse"
  Time (mean ± σ):      33.5 ms ±   0.6 ms    [User: 21.5 ms, System: 13.8 ms]
  Range (min … max):    32.4 ms …  34.7 ms    20 runs

Benchmark 3: python -c "import click"
  Time (mean ± σ):      56.0 ms ±   1.0 ms    [User: 29.3 ms, System: 21.6 ms]
  Range (min … max):    54.5 ms …  58.0 ms    20 runs

Benchmark 4: python -c "import typer"
  Time (mean ± σ):     142.7 ms ±   2.2 ms    [User: 99.5 ms, System: 36.6 ms]
  Range (min … max):   139.1 ms … 145.9 ms    20 runs

Benchmark 5: python -c "import cappa"
  Time (mean ± σ):     139.5 ms ±   2.1 ms    [User: 98.0 ms, System: 40.5 ms]
  Range (min … max):   137.5 ms … 146.5 ms    20 runs

Benchmark 6: python -c "import datargs"
  Time (mean ± σ):      55.5 ms ±   1.0 ms    [User: 34.7 ms, System: 19.0 ms]
  Range (min … max):    54.0 ms …  58.7 ms    20 runs

Benchmark 7: python -c "import pyrallis"
  Time (mean ± σ):      60.8 ms ±   0.8 ms    [User: 37.1 ms, System: 20.5 ms]
  Range (min … max):    59.6 ms …  63.0 ms    20 runs

Benchmark 8: python -c "import simple_parsing"
  Time (mean ± σ):      87.6 ms ±   2.1 ms    [User: 53.4 ms, System: 29.4 ms]
  Range (min … max):    85.5 ms …  91.9 ms    20 runs

Benchmark 9: python -c "import tap"
  Time (mean ± σ):      65.7 ms ±   1.3 ms    [User: 45.6 ms, System: 19.3 ms]
  Range (min … max):    64.3 ms …  68.1 ms    20 runs

Benchmark 10: python -c "import tyro"
  Time (mean ± σ):     106.6 ms ±   1.3 ms    [User: 73.0 ms, System: 30.5 ms]
  Range (min … max):   104.0 ms … 108.8 ms    20 runs

This is before doing any parsing or any other work for your application.
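
If you want to get a rough feel for this on your own machine without extra tooling, something like the following sketch works. It is not how the numbers above were produced; `startup_ms` is a helper name made up for this example, and a handful of runs is far noisier than a proper benchmark.

```python
import statistics
import subprocess
import sys
import time


def startup_ms(statement, runs=5):
    """Mean wall-clock time (ms) to run `python -c statement` in a fresh process."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", statement], check=True)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.mean(samples)


baseline = startup_ms("pass")
with_argparse = startup_ms("import argparse")
print(f"pass: {baseline:.1f} ms, import argparse: {with_argparse:.1f} ms")
```

For a per-module breakdown rather than a total, `python -X importtime -c "import argparse"` prints the import cost of each module to stderr.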


  1. At least all of those that I’m aware of.

8 Likes

Very true, that’s often why I use it! Also it seems like it’s been optimized for import time versus the proposed alternative:

1 Like

Why is there a desire for something niche in the standard library? Seems like a fantastic use case for a third-party library.

I fully agree.

If there are bugs or missing features in argparse, they should be fixed (unless these features revolve around ambiguity, like nargs="*" for options, because ambiguity should be avoided in CLI design and not encouraged).

If there are niche use cases that argparse will never be able to deal with, people should gather around a 3rd party module that implements an API for that niche.

2 Likes

Why in the world should we do that, when argparse is just fine for 99.9% (OK, 90% to be tolerant) of use cases? Removal of getopt and optparse, that’s a different thing, one I could happily stand behind.

1 Like

I think optparse has a couple of extra imports, it also depends on locale and textwrap. This makes the import a little slower than argparse, but not on the same level as some of the other packages.

argparse/optparse comparison
Benchmark 1: python -c "pass"
  Time (mean ± σ):      26.2 ms ±   0.7 ms    [User: 15.4 ms, System: 11.2 ms]
  Range (min … max):    24.9 ms …  27.4 ms    20 runs

Benchmark 2: python -c "import argparse"
  Time (mean ± σ):      33.7 ms ±   0.8 ms    [User: 18.7 ms, System: 15.0 ms]
  Range (min … max):    32.8 ms …  35.5 ms    20 runs

Benchmark 3: python -c "import optparse"
  Time (mean ± σ):      35.0 ms ±   0.9 ms    [User: 17.7 ms, System: 17.4 ms]
  Range (min … max):    34.0 ms …  37.2 ms    20 runs

Summary
  python -c "pass" ran
    1.29 ± 0.05 times faster than python -c "import argparse"
    1.34 ± 0.05 times faster than python -c "import optparse"
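
One way to check the "extra imports" guess yourself is to compare what each module pulls into a fresh interpreter. This is only a sketch; `new_modules` is a name invented here, and the result will vary by Python version and by what your startup configuration has already imported.

```python
import json
import subprocess
import sys

# Runs in a child process so modules already loaded here do not hide anything.
SNIPPET = (
    "import sys, json; before = set(sys.modules); import {name}; "
    "print(json.dumps(sorted(set(sys.modules) - before)))"
)


def new_modules(name):
    """Return the modules that `import name` adds to a fresh interpreter."""
    out = subprocess.run(
        [sys.executable, "-c", SNIPPET.format(name=name)],
        check=True, capture_output=True, text=True,
    ).stdout
    return set(json.loads(out))


# Modules loaded by optparse that argparse does not load.
extra = new_modules("optparse") - new_modules("argparse")
print(sorted(extra))
```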

I think it might be useful to give some time for more people to look at argparse issues. Maybe someone can come up with new solutions to address its shortcomings? Even if it would break things, it might be worth it given the benefit of not having 2 modules that are very similar.

“Niche” here does not mean something used by a handful of users. This means that getopt firmly occupies a place that cannot be occupied by something else. This is a part of a niche larger than Python. getopt is a good entry point for users already familiar with getopt in other programming languages. It helps to port programs to Python. I do not understand the desire to exclude it from the stdlib.

All this shows that we did a great job with promoting argparse.

Yes, it was optimized. See Reduce the number of imports for argparse · Issue #74338 · python/cpython · GitHub.

4 Likes

You are welcome.

If you break enough things to fix all argparse issues, you will get optparse, just with a different name. This is my plan actually, but it will take many years to do this right.

As long as this is the goal and the direction is clear, I like your proposal (regarding optparse/argparse). After all, if breaking changes are inevitable in bringing argparse to a certain level of completeness, then it is very useful to make it provisional.

I wonder what proportion of new Python programmers are familiar with getopt from another language (mostly C?). These days I suspect that’s pretty rare!

It also shows that the issues with argparse are not deal-breakers for the majority of users – at least, they are outweighed by the ease of use.

10 Likes

I agree 100%. These modules are stable and still have their uses.

3 Likes

…

Since we now live in a post-packaging world where you can install things easily and we have all come to accept that the stdlib doesn’t need to have everything in it and it won’t always be the best for everyone’s use-case, then trying to cater to every CLI API use-case in the stdlib seems unnecessary.

To me, that suggests we choose a “winner” CLI module and do what we need to do to make maintaining it easy.

Unfortunately, with the install numbers people are posting, it seems clear that dropping argparse would break too much. But it might not be the end of the world to drop getopt and optparse (but I’m known for wanting to slim down the stdlib, so I have a bias :grin:).

So with that in mind, I like Serhiy’s plan of cleaning up argparse and eventually making it the sole CLI module:

3 Likes

Take pip as an example of a popular tool that depends on optparse, and assume that you don’t want a tool like that to vendor its own copy of optparse, with all the subtle issues that might introduce. Then the timeline would have to look like this:

  • Improve argparse across multiple Python versions

  • Have an announcement that argparse is a viable alternative for tools like pip

  • Wait until all versions of Python where argparse isn’t good enough become EoL

  • Allow tools like pip to transition to the new argparse and hope they don’t find some show-stopping issue

  • Concurrently, deprecate optparse with a target that allows tools to transition

Optimistically this would be ~8 years? To me, it seems more realistically 10 to 15 years.

3 Likes