>>> class A:
... def lol(a):
... print(a)
...
>>> A().lol(1)
Traceback (most recent call last):
File "<python-input-1>", line 1, in <module>
A().lol(1)
~~~~~~~^^^
TypeError: A.lol() takes 1 positional argument but 2 were given
Can we catch this at class definition time? if the positional argument isn’t named self and there is no classmethod | staticmethod decorator, I’d expect something like “first argument must be self or cls or @staticmethod is expected”.
Though it is a convention to call the first argument self, AFAIK, the Python interpreter doesn’t enforce that. All it knows is that a bound method call, which automatically prepended the instance to the arguments, passed an unexpected number of positional arguments. Catching it at class definition time would likely break a lot of existing code. Also, if you use a linter or type checker, it would probably tell you the first argument of an instance method should be called self. This is rule N805 in Ruff, for example.
We can’t catch it at class definition time because the language does not require that you use the name self. However, the error message could be improved; we say “takes 1 positional argument but 2 were given” when it looks like exactly one was passed. If we detect that we’re calling a bound method and there are too few positionals, the error could mention that one of the bound arguments is the self argument.
Currently, we state the number of declared parameters, which includes the self-reference. One could mention that one of them is self and not supplied by the user. But that’s quite technical and irrelevant / difficult to understand for many users. If we can detect that a bound method is called, can we instead reduce the stated number of expected arguments when called.
In the simplest case, the message would be left as is but with a reduced count; in the example above “takes 0 positional arguments but 1 was given”.
If that is too sloppy wrt. to the definition, rephrasing in terms of expeceted arguments on call could be done “Calling bound method A.lol() requires 0 positional arguments but 1 was given.”
TBH I’d leave this out; forgetting self is only one of many ways that this could happen. In order to catch that scenario, a better option would be for your editor or linter to detect it, since heuristics like “the first argument is not named self” are completely reasonable and acceptable in linters.
Agreed. There are two possible error to end up with an invalid number of arguments.
Improper method definition (self forgotten)
This cannot be handled on the language level and must be left to linters. It’s also an author error. Assuming that developers who write classes have a certain level of understanding and should test their own code, they will manage.
Users calling with the wrong number of arguments
This should be much more common in the wild and the one we should primarily focus on.
To present the opposite view, I find the “Perhaps you forgot self” quite useful and the “perhaps” sufficient not to rule out other causes. This error occurs often during tests, in “quick and dirty” helper scripts written just to get a task quickly done, etc - not in regular projects where a linter would catch the problem.
I disagree. Developers who write classes are all at one time beginners learning to write classes too, and the mistake illustrated in the OP is actually quite commonly made by beginners learning to write classes. I think this idea can in fact help beginners.
But that isn’t really what this thread is about. This thread is about users thinking they’re calling with the right number of arguments but are getting an error about a wrong number of arguments only because they’re forgetting to declare self in the method.
Surely almost all the false positives can be eliminated by requiring there to be one spare argument and A().lol.__code__.co_varnames not starting with "self"?
I agree, though the challenge is how we can detect that we’re calling a bound method when currently the plumbing is just not there–by the time the error is being generated the context of a bound method is already gone.
One fairly reasonable heuristic I can think of to help detect a bound method call is to check if the type of the first argument has an attribute whose name is the same as the name of the code object of the function being called, and whose value is the code object of the function being called. Unless one deliberately tries to meet these conditions, the heuristic should be enough to determine that a helpful hint is warranted when the number of given arguments is one more than the number of declare parameters.
Here’s my attempt at a modification to the existing too_many_positional function in Python/ceval.c to help illustrate the heuristic:
static int
implicit_self_was_passed(PyCodeObject *co, _PyStackRef *localsplus)
{
if (co->co_argcount < 1) {
return 0;
}
PyObject *first_name = PyTuple_GET_ITEM(co->co_localsplusnames, 0);
/* if the first parameter is actually named self, it means the user
clearly knows about self so there's no need to produce the hint */
if (_PyUnicode_EqualToASCIIString(first_name, "self")) {
return 0;
}
PyObject *first = PyStackRef_AsPyObjectBorrow(localsplus[0]);
if (first == NULL) {
return 0;
}
PyObject *attr = _PyType_Lookup(Py_TYPE(first), co->co_name);
if (attr == NULL) {
return 0;
}
if (!PyFunction_Check(attr)) {
return 0;
}
return ((PyFunctionObject *)attr)->func_code == (PyObject *)co;
}
static void
too_many_positional(PyThreadState *tstate, PyCodeObject *co,
Py_ssize_t given, PyObject *defaults,
_PyStackRef *localsplus, PyObject *qualname)
{
/* ... */
const char *self_hint =
(given == co_argcount + 1 && implicit_self_was_passed(co, localsplus))
? ". Note: 'self' was passed implicitly; did you forget to declare it?"
: "";
_PyErr_Format(tstate, PyExc_TypeError,
"%U() takes %U positional argument%s but %zd%U %s given%s",
qualname,
sig,
plural ? "s" : "",
given,
kwonly_sig,
given == 1 && !kwonly_given ? "was" : "were",
self_hint);
/* ... */
}
I like Ducho + Chris’ + Brénainn error message. It feels concrete and good enough.
I feel uncomfortable with the practice of prepending the self object into method params and then not treating it specially (e.g. it doesn’t have to be named self). Why does python not enforce this convention?
I do think this can become too verbose too easily. We should focus on helping beginners who aren’t yet familiar with how Python, unlike most other programming languages, injects an implicit instance as the first argument to a call to a bound method, so the hint should be more directed at this common scenario. Other types of mismatches between call args vs. declared params can be easily spotted and understood by even beginners.
There are scenarios where you don’t want to call it self. I’ve only come across it once in the wild, but the possibility being there is one of the things I love about Python.
I can’t actually figure out where I met the possibility initially, I thought there was a construction with nested class definition but I can’t get that to work anymore. But here’s someone online who also has good reasons to define methods where the first argument isn’t called self: https://medium.com/@dungwoong/i-finally-understand-python-metaclasses-86b4f643f5ff
(When defining metaclasses, it’s common for the inplicitly injected instance name to be called mcls or cls)
One common use case for a non-self-named first parameter for an instance method (besides cls for class methods and mcs/mcls/metacls for metaclass methods) is for a wrapper function defined in a method to wrap another instance method while needing access to both the inner self and the outer self, thereby often naming the inner selfthis instead.
In the example below, the wrapper function wrap_func becomes a bound instance method once a method decorated with Deprecated is called:
I think you are raising this because the proposal explicitly checks for the name “self”, and that if someone uses a different name, then the nice error message won’t kick in. This is fine. The goal here is “beginner friendly” error messages. Beginners aren’t going to use names other than self. People writing these more involved structures will be well-equipped to decipher possibly confusing error messages.
A solution that helps 98% of people is a really good solution.
I definitely remember this being confusing as a beginner, so am generally supportive of improving the error message. On the point about it not always being called self, even in the cases where the user has chosen to call it something else, it’s perhaps still fair to refer to it as “the self argument”. Would need to consider cls for class methods too.