Should `None` defaults for optional arguments be discouraged?

mdickinson · February 5, 2023, 1:04pm

I recently merged a PR authored by Sergey Kirpichev that fixes inspect.signature (which previously failed with a ValueError) for math.log and cmath.log. A side-effect of that change is that math.log and cmath.log now accept the Python value None for the base argument. That is:

>>> from math import log
>>> log(2.3, None)
0.8329091229351039

Previously, this was a TypeError.

In comments on the original issue, Raymond Hettinger alludes to the new API as being “damaged”:

there seems to be willingness to damage the function rather than on improving the capabilities of signature objects […]

and

people should be working on that problem rather than damaging the underlying functions […]

I’m not seeing the damage here, and accepting None seems to me like a reasonable trade-off for the benefit of having usable signatures. The None default is a common idiom, and at least one other function in the math module already accepts None in this way (as do many non-math functions, of course):

>>> from math import perm
>>> perm(10, None)
3628800

More generally, I’d expect that having functions with signatures expressible in standard form would aid consistency and compatibility with other Python implementations, as well as helping tools that need to do inspection of signatures for one reason or another.
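To make the inspection point concrete, here is a minimal sketch of how a tool might probe a callable's signature. The helper name `describe_signature` and the example function `scale` are hypothetical. Whether `math.log` yields a signature or an error depends on the Python version, so no output is shown for it.

```python
import inspect
import math


def describe_signature(func):
    """Return the signature of func as text, or a note if none is available."""
    try:
        return str(inspect.signature(func))
    except (ValueError, TypeError) as exc:
        # inspect.signature raises ValueError when no signature can be
        # provided, and TypeError when the object type is not supported.
        return f"<no signature: {exc}>"


def scale(x, factor=None):
    """Pure-Python function using the None-default idiom."""
    return x if factor is None else x * factor


print(describe_signature(scale))     # (x, factor=None)
print(describe_signature(math.log))  # version-dependent
```

A tool built this way needs a fallback path for every callable that lands in the `except` branch, which is the cost being weighed against accepting None.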

Is there a general principle that we should avoid these sorts of changes? What are the downsides of allowing None in this kind of situation?

uranusjr · February 5, 2023, 1:19pm

In theory, I wish optional arguments were less often expressed as arg=None, but without a good way to express sentinels [1], None is the most reasonable compromise to me.

[1] Sentinel values in the stdlib

pitrou · February 5, 2023, 1:22pm

This looks more surprising to me than simply log(2.3). I parse the latter immediately; with the former it takes a few seconds to work out what that None is for.

So in this particular case of well-known math functions I think that Raymond is right.
That doesn’t mean that passing None can’t be useful (for example it probably makes writing proxies/wrappers easier), but it doesn’t result in very readable code when done directly as in your example.

mdickinson · February 5, 2023, 1:34pm

Right - I wouldn’t expect anyone to be writing code that way in practice, and it definitely wouldn’t be the recommended way to write a log with default base; the change is simply that that’s now permitted.

mdickinson · February 5, 2023, 1:55pm

To give an analogy, it’s similar to the built-in round function: it’s already permitted to write round(2.575, None), but I wouldn’t expect anyone to do so in practice, and I’ve seen no evidence that this permission causes people to deliberately write their round calls in that form. It seems to have been pretty harmless in the round case, and I can’t imagine why it would be any less harmless in the math.log case.

>>> round(3.25, None)
3

arhadthedev · February 5, 2023, 2:09pm

If the single-argument version means a natural logarithm (implying base=math.e), can we declare log as follows and not allow None?

log(x, base=math.e)

storchaka · February 5, 2023, 2:21pm

I concur with Raymond. We should accept that inspect.signature() is not able to represent signatures of all extension functions (for example dict.pop, constructors of int, str).

None is convenient as a default value in most functions implemented in Python, but it is not so conventional for functions implemented in C (it is not accepted as an optional int or double argument), and in some cases passing None and not passing an argument both have valid but different semantics (for example in dict.pop).

The correct solution is to implement support of alternate signatures. For example:

dict.pop(key, /)
dict.pop(key, default, /)

str()
str(object, /)
str(bytes, /, encoding, errors='strict')
str(bytes, /, errors)
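The dict.pop point can be checked directly. This short sketch shows that an explicit None and an omitted argument behave differently, so None cannot stand in for "not passed" there:

```python
d = {"a": 1}

# An explicit None is a real default value: a missing key returns None.
print(d.pop("missing", None))  # None

# Omitting the default changes the behaviour: a missing key raises KeyError.
try:
    d.pop("missing")
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'missing'
```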

pitrou · February 5, 2023, 3:07pm

As a sidenote, I don’t understand why math.perm’s second argument is optional.

mdickinson · February 5, 2023, 3:15pm

For context, there’s some discussion in “One argument form of math.perm()” (python/cpython issue #81359).

guido · February 5, 2023, 7:51pm

Contrary to most reactions, I find it natural and desirable to accept None as an alternative to omitting an optional parameter in a wide variety of APIs. Not because I like to read or write log(2.3, None) but because I like to be able to define a simple wrapper, e.g.

def my_log(x, base=None):
    # <extra stuff here>
    return math.log(x, base)

This allows both my_log(x) and my_log(x, base) to be called, and my_log() doesn’t have to hard-code knowledge about the default base. While it is possible to write a wrapper that accepts an optional extra argument and passes that on only when present, it is uglier and harder to read, e.g.

def my_log(x, *args):
    assert len(args) in (0, 1)
    # <extra stuff here>
    return math.log(x, *args)

This is slower too (even if you take out the assert) because handling *args takes a slower path in the interpreter. And why should you have to do it this way?

(EDIT: The version with *args also makes it more awkward to access base in the wrapper in case the “extra stuff” wants to intercept a certain base.)
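For comparison, here is a minimal sketch of a wrapper that keeps `base` as a named parameter but does not rely on `math.log` accepting None, so it should behave the same whether or not the interpreter includes that change. The interception of base 2 is a hypothetical example of the "extra stuff", not something taken from the thread.

```python
import math


def my_log(x, base=None):
    """Hypothetical wrapper: natural logarithm when base is None."""
    if base is None:
        # Do not forward None: math.log rejects it with TypeError on
        # versions that lack the change discussed here.
        return math.log(x)
    if base == 2:
        # A named parameter makes it easy to intercept a particular base.
        return math.log2(x)
    return math.log(x, base)


print(my_log(8, 2))  # 3.0
```

The price is the explicit branch on None, which is exactly the boilerplate that a None-accepting `math.log` would make unnecessary.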

gpshead · February 5, 2023, 8:41pm

I agree with Guido here. There is nothing wrong with accepting None as a proxy for “the default value” on arguments where there otherwise would never be a meaningful interpretation of None.

There is a very minor consequence for static analysis: it cannot infer that the value of a variable being passed into such an API must not be None after the call, as the type signature of the function must naturally be declared as float|None. But this really doesn’t matter in practice.

Even when writing pure Python code, it may seem easier to write a function with its default value in the parameter list at first… But when it is something commonly wrapped or overridden, it can be worth the extra hoop to use None as the default and add x = "default" if x is None else x in the method. That makes life easier for wrappers and overriders, who can then avoid copying the default into their own declarations or jumping through wacky conditional hoops.
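A minimal sketch of that pattern, using hypothetical names (`DEFAULT_GREETING`, `greet`, `loud_greet`) chosen only for illustration:

```python
DEFAULT_GREETING = "hello"  # hypothetical default, defined in one place only


def greet(name, greeting=None):
    """Resolve the real default inside the body rather than in the signature."""
    if greeting is None:
        greeting = DEFAULT_GREETING
    return f"{greeting}, {name}"


def loud_greet(name, greeting=None):
    """Wrapper that forwards None without knowing the underlying default."""
    return greet(name, greeting).upper()


print(greet("Ada"))             # hello, Ada
print(loud_greet("Ada"))        # HELLO, ADA
print(loud_greet("Ada", "hi"))  # HI, ADA
```

If the default ever changes, only `greet` needs to be touched; the wrapper keeps working unchanged.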

gpshead · February 5, 2023, 8:44pm

One thing we do need (which I believe has been noted above) is a way to get the type signature showing up in the docstring (and thus IDE help text as Raymond noted in the issue) to display the underlying default.

That None is accepted is more of an implementation detail, not the way you want to document the API.

storchaka · February 5, 2023, 9:00pm

This does not work for dict.pop() and range(). It would be nice to have either a special syntax for optional parameters without a default value, e.g.:

def mapping_pop(mapping, key, default=?):
    try:
        res = mapping[key]
    except KeyError:
        if not isset default:
            raise
        res = default
    else:
        del mapping[key]
    return res

or a way to overload a function by the number of arguments, e.g.:

def mapping_pop(mapping, key):
    res = mapping[key]
    del mapping[key]
    return res

@overload
def mapping_pop(mapping, key, default):
    try:
        res = mapping[key]
    except KeyError:
        res = default
    else:
        del mapping[key]
    return res

The former option may be more convenient in many simpler cases, but the latter option is more powerful.
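Neither syntax exists in Python today. The usual workaround is a private sentinel object; the sketch below assumes a module-level name `_MISSING`, which is an illustrative choice rather than a standard one.

```python
_MISSING = object()  # private sentinel meaning "argument not passed" (assumed name)


def mapping_pop(mapping, key, default=_MISSING):
    """Remove key and return its value; fall back to default only if one was given."""
    try:
        res = mapping[key]
    except KeyError:
        if default is _MISSING:
            raise  # no default supplied: propagate the KeyError
        return default
    del mapping[key]
    return res


d = {"a": 1}
print(mapping_pop(d, "a"))           # 1
print(mapping_pop(d, "a", None))     # None (an explicit None is a valid default)
```

This works, but the sentinel leaks into the introspected signature as an opaque `<object object at 0x...>` default, which is part of why it does not solve the documentation problem.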

skirpichev · February 5, 2023, 11:43pm

It was declared this way before, and that was wrong: it gives the impression that this is an ordinary case of a generic base (“calculated as log(x)/log(base)”, per the rst docs).

But this is not the case: math.e is a float, not the real number e. We have the same situation in, for example, the exp() function, and that is noted in the rst docs: “Return e raised to the power x, where e = 2.718281… is the base of natural logarithms. This is usually more accurate than math.e ** x or pow(math.e, x).”. (BTW, for the same reason some other docstrings/rst docs look wrong, e.g. for exp: “Return e raised to the power of x.” Better: “Return the exponential of x.”)

None seems to fit this case well, just as for perm: in the latter function the default value depends on the first argument. In the math.log case, we can’t represent the default value by some other standard type (that might be possible with some CAS, such as SageMath, but not with the current stdlib).
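A rough way to see the float-versus-real distinction is to compare the dedicated functions with their math.e-based counterparts. This is only a sketch: the helper name `precision_gap` is hypothetical, and the exact differences depend on the platform's math library, so no output values are shown.

```python
import math


def rel_diff(a, b):
    """Relative difference between two nonzero floats."""
    return abs(a - b) / max(abs(a), abs(b))


def precision_gap(x):
    """Return (log gap, exp gap) between dedicated and math.e-based forms."""
    log_gap = rel_diff(math.log(x), math.log(x, math.e))
    exp_gap = rel_diff(math.exp(x), math.e ** x)
    return log_gap, exp_gap


for x in (0.5, 2.3, 10.0, 100.0):
    print(x, precision_gap(x))
```

Any nonzero gap comes from math.e being a rounded approximation of e, which is why a signature advertising base=math.e would describe a slightly different computation.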

skirpichev · February 6, 2023, 12:09am

How about this fix for the math.log docstring (and for cmath.log):
Return the logarithm of x to the given base or the natural logarithm of x
(with the base=None patch, of course)?

This is how IDLE looks with this change: [screenshot not preserved]

BTW @mdickinson, here is a new PR that addresses some minor issues like the above.

skirpichev · February 6, 2023, 12:11am

Unfortunately, such syntax (like C++ function overloading) isn’t possible with the CPython interpreter. I.e., I can’t reproduce the current signature of math.log with a pure-Python function, right?

tjreedy · February 6, 2023, 3:13am

I sympathize with Raymond’s objection, but also Mark’s unease.

For round, the absence of the 2nd argument is not the same as the presence of the default. A signature with ndigits=0 would be misleading.

>>> round(3.33)
3
>>> round(3.33, 0)
3.0

The summary line does not explain this difference, but the rest of the docstring does.
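The same comparison, extended to include an explicit None, shows why None is an honest default for round while 0 would not be:

```python
results = [round(3.33), round(3.33, None), round(3.33, 0)]

for value in results:
    print(repr(value), type(value).__name__)
# 3 int
# 3 int
# 3.0 float
```

Passing None matches the one-argument call exactly, whereas ndigits=0 changes the return type.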

For math.log, the difference between no base and base=math.e is more theoretical than practical as long as log(math.e) == 1.0. But the unease is that (I think) it does not have to be, and aside from the wasted time, we would not want an implementation of log to actually divide by log(e).

hugovk · February 6, 2023, 5:54am