Getopt and optparse vs argparse

nexushoratio · November 4, 2024, 5:06pm

Regarding dd, my understanding is that it was designed that way to make folks coming from the mainframe world (perhaps specifically JCL) feel more “comfortable” with the interface.

nexushoratio · November 4, 2024, 5:12pm

I have a wrapper library around argparse. By your argument, it should be removed from stdlib.

(Though after reading this thread, perhaps I should rewrite it to use optparse. Hmmm)

storchaka · November 4, 2024, 5:13pm

Traditional among Unix and Linux commands. There are outliers, but most programs follow these traditions, which may even predate Unix. They usually use POSIX or GNU getopt() or its analogues in other programming languages. End users of a program expect Unix/Linux behavior when they see a Unix/Linux-like CLI.

Another problem is that programmers who see a list of argparse features think they can use them to implement some CLI, but they fail, because different argparse features do not always work together. For example, you cannot implement a cp-like CLI in argparse.

There are different and weaker traditions in the DOS/Windows world, and none of the three modules supports them, but that is a different story.

Some users would be happy if the result of division by 0 were 0, but that does not make it a good idea.

It is good to provide an extension to the traditional interface that catches common user errors. But first the module should support the basics.

Neither does argparse. Adding support for different option prefixes is the simplest feature. argparse does not support using : as the option-value separator, nor some other peculiarities of the DOS/Windows CLI. It is equally easy to add such support to optparse and to argparse. If optparse had not been frozen for the last 15 years, it could have most of argparse's features while preserving backward compatibility. The features that can’t be added to optparse do not work in argparse either.
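To make the prefix point concrete, here is a minimal sketch using argparse's prefix_chars parameter. The /out option and the file name are made up for the example; it only shows that a different prefix character is accepted while a colon is not treated as an option-value separator.

```python
import argparse

# Hypothetical DOS-style parser: "/" replaces "-" as the option prefix.
parser = argparse.ArgumentParser(prog="demo", prefix_chars="/")
parser.add_argument("/out")

# Space-separated value: handled like any other option.
spaced = parser.parse_args(["/out", "result.txt"])

# Colon-separated value: argparse does not split on ":", so the whole
# token is left over as an unrecognized argument.
colon, leftover = parser.parse_known_args(["/out:result.txt"])
```

parse_known_args() is used for the second call only so that the unrecognized token can be inspected; parse_args() would exit with an error instead.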

storchaka · November 4, 2024, 5:20pm

Yet another difference between optparse and argparse is validation of the parser and option parameters. In optparse everything is validated. In argparse almost nothing is validated. As a result, users try to use argparse with values for which it was not designed and is not tested, and when it does not work as they expect, they open an issue. Some validations have been added in past years as a result of such reports.
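As one small illustration of the optparse side of this (a sketch only, with a deliberately misspelled type name), an invalid option definition is rejected when the option is added, before any command line is parsed:

```python
import optparse

parser = optparse.OptionParser()

# "itn" is an intentional typo for "int"; optparse checks the type name
# against its known types at definition time.
try:
    parser.add_option("-n", "--count", type="itn")
except optparse.OptionError as exc:
    message = str(exc)
else:
    message = None
```

This shows only one of the checks optparse performs; it says nothing about which argparse parameters are or are not validated in a given Python version.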

pitrou · November 4, 2024, 6:12pm

When is that an actual use case?

dg-pb · November 4, 2024, 8:43pm

Same situation here. But I will give it time and postpone this temptation until there is some resolution here.

ncoghlan · November 5, 2024, 1:09am

(about replicating getopt)

It’s less a modern use case and more “Why was this ever added to the standard library?”.

I ended up leaving it in the “Superseded” chapter of the stdlib docs in my draft PR, as I don’t have a good reason for ever using it over optparse.

I mentioned on the issue tracker that I thought we were conflating two different questions, but I’m starting to think there are actually 3:

1. reversing the module deprecations, and instead explaining when one might choose optparse over argparse (both of our draft PRs do this)
2. recommending optparse over argparse due to the better tested and more predictable, albeit also more limited, behaviour of the former (your draft PR leans more heavily in that direction than mine does)
3. lifting the optparse feature freeze to make writing APIs that are compliant with Unix conventions easier over time (my draft PR more strongly emphasises that the feature freeze remains in place than yours does, explicitly deferring future optparse enhancements to PyPI)

oscarbenjamin · November 5, 2024, 1:33am

If you want to use optparse you don’t need to wait for any resolution here. It is still there and I don’t see it going away.

Regardless of all the discussion about differences between optparse and argparse, they are actually very similar. The characterisation above of optparse as “low-level” does not make sense in contrast to argparse, since they basically have the same (clunky) interface.

The primary difference is that optparse just separates the positional arguments from the option arguments and returns a tuple of those positional arguments. You then write the code that checks how many arguments there are:

options, args = parser.parse_args()
if len(args) == 1:
    [infile] = args
elif len(args) == 2:
    [infile, outfile] = args
else:
    print("bad arguments")

With argparse you would specify the arguments infile and outfile declaratively with parser.add_argument() and it generates the error message if the number of arguments is wrong and otherwise attaches values like args.infile (you will likely end up writing more or less the same if/elif to process the output of argparse anyway though).
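A minimal sketch of that declarative version, assuming outfile is meant to be optional (the file names and prog are placeholders):

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("infile")
# nargs="?" makes the second positional optional; it defaults to None.
parser.add_argument("outfile", nargs="?")

one = parser.parse_args(["in.txt"])
two = parser.parse_args(["in.txt", "out.txt"])
```

With zero or three positional arguments, parse_args() prints a usage error and exits, which replaces the final else branch in the optparse version.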

The characterisation that optparse does not “support positional arguments” is unreasonable: it separates and returns the positional arguments. It is not difficult to parse a sequence of positional arguments in the context of a particular script. The awkward part of argument parsing is the option arguments like --foo etc which is what optparse takes care of.
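For what it is worth, the hand-written check can report problems the same way the option parser does by calling parser.error(), which prints the usage line and exits with status 2. A sketch, where the parse() helper, the usage text, and the error wording are all invented for the example:

```python
import optparse


def parse(argv):
    """Parse argv (without the program name) into options, infile, outfile."""
    parser = optparse.OptionParser(
        prog="demo", usage="%prog [options] INFILE [OUTFILE]"
    )
    options, args = parser.parse_args(argv)
    if len(args) == 1:
        infile, outfile = args[0], None
    elif len(args) == 2:
        infile, outfile = args
    else:
        # Prints usage plus this message to stderr, then raises SystemExit(2).
        parser.error("expected INFILE and an optional OUTFILE")
    return options, infile, outfile
```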

There are other differences but for most simple scripts you wouldn’t notice the difference either in terms of features or bugs. For a more complicated CLI third party libraries are better.

methane · November 5, 2024, 3:43am

Rewriting a bash script that uses getopt in Python.

pitrou · November 5, 2024, 7:43am

No, @ncoghlan suggested the reverse: write Python code that can be rewritten in C. I’m asking when one would want to do that.

Jelle · November 5, 2024, 4:08pm

Something useful to add to the documentation would be a document explaining the differences between the three modules. The closest we have right now is Upgrading optparse code — Python 3.14.0a1 documentation, which explains how to move from optparse to argparse.

Such a guide could start with a simple example CLI (a short-form and a long-form option, some positional arguments), show how to implement it using all three modules, then explain any places where the three implementations differ in behavior. Then it could also discuss what other behaviors can only be implemented by some of the three CLI modules.
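As a rough idea of what the starting example might look like, here is one possible sketch of the same small CLI in all three modules. The option names, the function names, and the sample command line are invented for illustration; it covers only a case where the three agree, not the places where they differ.

```python
import argparse
import getopt
import optparse

# Sample command line: one flag, one option with a value, two positionals.
ARGV = ["-v", "--output", "out.txt", "a.txt", "b.txt"]


def with_getopt(argv):
    opts, args = getopt.getopt(argv, "vo:", ["verbose", "output="])
    verbose, output = False, None
    # getopt returns (name, value) pairs; interpreting them is up to the caller.
    for name, value in opts:
        if name in ("-v", "--verbose"):
            verbose = True
        elif name in ("-o", "--output"):
            output = value
    return verbose, output, args


def with_optparse(argv):
    parser = optparse.OptionParser()
    parser.add_option("-v", "--verbose", action="store_true", default=False)
    parser.add_option("-o", "--output")
    # Positionals come back as a separate list.
    options, args = parser.parse_args(argv)
    return options.verbose, options.output, args


def with_argparse(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-o", "--output")
    # Positionals are declared like options and land on the namespace.
    parser.add_argument("files", nargs="*")
    ns = parser.parse_args(argv)
    return ns.verbose, ns.output, ns.files
```

For this particular input all three functions return the same triple; the interesting part of such a guide would be the inputs where they do not.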

ofek · November 5, 2024, 4:34pm

Oscar Benjamin:

The primary difference is that optparse just separates the positional arguments from the option arguments and returns a tuple of those positional arguments. You then write the code that checks how many arguments there are:
options, args = parser.parse_args()
if len(args) == 1:
    [infile] = args
elif len(args) == 2:
    [infile, outfile] = args
else:
    print("bad arguments")

That’s incredibly poor UX, I’m glad we don’t recommend that module.

takluyver · November 5, 2024, 5:14pm

I agree that it’s not terribly difficult to parse positional args within a script, but I think the value of declaring them as part of the parser is that they show up in help automatically. We tend to focus on logic, but when --help is missing or incomplete I really miss it.

oscarbenjamin · November 5, 2024, 6:03pm

It is not hard to put this in the usage message (I actually prefer for this not to be automatically generated):

import optparse

USAGE = """
%prog [options] SRC DEST
%prog [options] SRC1 SRC2 ... DESTDIR

Move files from SRC to DEST, or multiple SRC files to DESTDIR.\
"""
parser = optparse.OptionParser(usage=USAGE)
options, args = parser.parse_args()

Then:

$ ./mv.py -h
Usage: 
mv.py [options] SRC DEST
mv.py [options] SRC1 SRC2 ... DESTDIR

Move files from SRC to DEST, or multiple SRC files to DESTDIR.

Options:
  -h, --help  show this help message and exit

I’m not going to write out a full demonstration here but if you try writing the same real script with optparse and with argparse you will see that you end up with basically the same code.

ofek · November 6, 2024, 1:25pm

Do you realize that automatic help generation is the preference of the vast majority of users?

oscarbenjamin · November 6, 2024, 3:35pm

It is not really automatic though: you have to specify the arguments with add_argument so that it can generate the usage message. You also still have to write out part of the usage message anyway because it is not possible to generate any explanation of what the program actually does automatically. You will also have to write some actual code with if/elif and other perfectly normal programming constructs at some point anyway regardless of whether the arguments are specified in a declarative manner for argparse.

The automatically generated usage message that comes out of argparse also includes all of the option arguments in the usage string which I would only do if there were one or two options because it otherwise obscures how the command line is normally expected to look:

$ python mv.py -h
usage: mv.py [-h] [-f FORCE] [-q QUIET] [-s FOLLOW_SYMLINKS] [-l LOG_FILE] [-p PRESERVE_TIMESTAMPS]
             [-o OVERWRITE_EXISTING]
             infile [infile ...] outfile

positional arguments:
  infile
  outfile

options:
  -h, --help            show this help message and exit
  -f FORCE, --force FORCE
  -q QUIET, --quiet QUIET
  ...
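A parser along the following lines gives usage output of that shape. This is a reconstruction for illustration, not the script that produced the output above, and it defines only two of the options:

```python
import argparse

parser = argparse.ArgumentParser(prog="mv.py")
# Options without an action take a value, hence "-f FORCE" in the usage.
parser.add_argument("-f", "--force")
parser.add_argument("-q", "--quiet")
# One or more sources followed by a single destination.
parser.add_argument("infile", nargs="+")
parser.add_argument("outfile")

# format_usage() returns the generated usage string without printing it.
usage = parser.format_usage()
```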

The optparse output with my hand-written usage string is different because I don’t want to enumerate the options redundantly and it is clearer to express the CLI by showing multiple usage lines:

$ python mv.py -h
Usage: 
mv.py [options] SRC DEST
mv.py [options] SRC1 SRC2 ... DESTDIR

Move file from SRC to DEST, or multiple SRC files to DESTDIR.

Options:
  -h, --help            show this help message and exit
  -f FORCE, --force=FORCE
  -q QUIET, --quiet=QUIET
  ...

It is easy to overstate the supposed benefits here. It is very common in Python that interface designers overcomplicate things that are already simple and easy like writing a simple usage string or checking how many arguments there are with if.

ofek · November 6, 2024, 4:06pm

Do you think it’s a desirable situation to have every user writing conditional branches or manually constructing a help text string with the same format as everyone else?

Jelle · November 6, 2024, 4:44pm

I think it’s time to drop this argument. Different people may have different preferences around how to write code, and that’s OK. Let’s focus on ways we can make the CPython docs and implementation better serve users.

ofek · November 6, 2024, 5:26pm

I totally agree with this which is why I am participating in this thread. There are advocates for recommending optparse rather than argparse which I believe is not in the best interest of users.

ncoghlan · November 8, 2024, 12:25pm

I really don’t want us to inflict that level of potential analysis paralysis on students working through their first ever “writing Python applications” tutorial.

At the same time, I do agree with @storchaka that there are sometimes valid reasons to want to control argument parsing behaviours that argparse doesn’t make configurable (especially where leading = and - characters are concerned).

These competing perspectives are what I’ve tried to reconcile when writing python/cpython pull request #126227 (gh-126180: Remove getopt and optparse deprecation notices).

I still have some suggestions from Serhiy to account for, but additional feedback would be welcome.