What is the difference between a function and a shallow object with `__call__`?

Background

I would like to understand how functions are different from typical classes that implement __call__. The objects produced are clearly not the same, but I am ignorant about what goes on behind the scenes.

At some level of analysis I don’t really ‘need’ to know; the stuff I write doesn’t depend on the distinction. But after coding in Python for 10 years I think I ‘want’ to know.

My rough expectation is that they both boil down to a PyCodeObject at the C level, but beyond that I don’t know where in the code to look. I have not written a lot of C code, so I am hoping someone can explain what is going on (C concepts are fine, as I don’t mind learning) without me having to stumble into a setup where I can run GDB and follow a rabbit trail of calls… At least, that’s what I imagine I would see without knowing what I am looking for.

I recently bought CPython Internals, so references to that book would also be useful. I also just found CPython’s online internals documentation, although I am unsure if/where my question is addressed there.

Example

I am considering the difference between

def foo():
    pass

and

class Bar:
    def __call__(self):
        pass

foo = Bar()

We can make some superficial comparisons:

>>> Bar(), foo
(<__main__.Bar object at 0x72341a7a3d40>, <function foo at 0x72341a7adf80>)
>>> dis(Bar())
>>> dis(foo)
  1           0 RESUME                   0

  2           2 RETURN_CONST             0 (None)
>>> set(dir(Bar())) - set(dir(foo))
{'__weakref__'}
>>> set(dir(foo)) - set(dir(Bar()))
{'__annotations__', '__get__', '__closure__', '__builtins__', '__name__', '__type_params__', '__qualname__', '__kwdefaults__', '__code__', '__globals__', '__defaults__'}

Question

What is going on behind the scenes when I define a function vs when I define a class with __call__?

I think of a function as some sort of standardised class, which has __call__ and extra attributes.

def foo():
    return 1

# foo has __call__
print(foo.__call__)    # method-wrapper bound to foo
print(foo.__call__.__self__)    # <function foo>

type(foo).__call__(foo)    # 1

So FunctionType is a lower level Functor.

When you define your own class with __call__, you have a Functor which implements its __call__ method using another (lower-level) Functor.

In Python everything is an object: isinstance(obj, object) is True for any object, even object itself. And everything that is callable implements a __call__ method.

Now, your attribute comparison shows that your own custom Functor does not have the same attributes as a function. The reason is that the user-defined functor has those attributes attached to its __call__, which is itself a function.
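
For example (a minimal sketch, reusing the Bar class from the question), the function-only attributes live on the method implementing __call__, not on the instance itself:

class Bar:
    def __call__(self):
        pass

bar = Bar()
print(hasattr(bar, '__code__'))           # False: the instance has no code object
print(hasattr(bar.__call__, '__code__'))  # True: the bound method forwards to its function
print(bar.__call__.__func__)              # <function Bar.__call__ at 0x...>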

Maybe someone can offer a more complete picture starting from the very bottom layer, but the above has served me well for a very long time and I have never needed to go deeper than that.

I just imagine a function to be a class:

class FunctionType:
    __annotations__ = ...

    def __call__(self, *args, **kwds):
        # bind positional args to parameter names, then layer keyword args on top
        namespace = {**dict(zip(self.argnames, args)), **kwds}
        return eval(self.function_body, namespace)

This is created via the syntactic convenience def name(*args, **kwds): ....
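
For what it’s worth, the real types.FunctionType can also be called directly, which is a rough way to see that a function object is mostly a wrapper around a code object plus globals (a sketch for illustration, not how CPython builds functions internally):

import types

def foo():
    return 1

# build a new function object from foo's code object and the current globals
clone = types.FunctionType(foo.__code__, globals(), name='foo_clone')
print(type(clone) is types.FunctionType)  # True
print(clone())                            # 1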

There are a few differences at the C level.

At the C level, calling something uses the tp_call slot (and/or vectorcall if it exists, which should be a faster way of doing the same thing).

For the direct function type, the tp_call slot is set to something reasonably optimized that just goes straight into evaluating your Python code. And vectorcall exists and is usually used.

For the class with __call__, the class is filled in with a tp_call slot that looks up the name __call__ on the class, gets the function object for it, and then calls it. So there’s an extra level of indirection, and I don’t think vectorcall is used.
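
One visible consequence of that lookup on the class (a small sketch, assuming a Bar-like class as in the question) is that it skips the instance entirely, so setting __call__ on the instance does not change what calling the object does:

class Bar:
    def __call__(self):
        return "class __call__"

bar = Bar()
bar.__call__ = lambda: "instance attribute"  # ignored by the call syntax

print(bar())           # class __call__    -- resolved via type(bar), i.e. tp_call
print(bar.__call__())  # instance attribute -- ordinary attribute lookup finds it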

One of the differences, as @dg-pb points out, is that a function object comes with some attributes that your “class with __call__” doesn’t have.

I’ve been playing with this myself recently. I think the most significant difference is that a function has a __get__ method. This makes it a descriptor, and so it exhibits binding behaviour. This matters when it is found as an attribute of a class.

Here’s what I mean:

>>> class MyCallable:
...     def __call__(self, *args, **kwargs):
...         print(self, args, kwargs)
... 
>>> def fun(x, *args, **kwargs):
...     print(x, args, kwargs)

The difference between these two comes when you put them in a class definition:

>>> class A:
...     c = MyCallable()
...     f = fun

Now let’s make an instance and try to treat c and f as methods:

>>> a = A()
>>> a
<__main__.A object at 0x0000023778A7D790>
>>> a.c(1, 2, s=3)
<__main__.MyCallable object at 0x0000023778A7C850> (1, 2) {'s': 3}
>>> a.f(1, 2, s=3)
<__main__.A object at 0x0000023778A7D790> (1, 2) {'s': 3}
>>>

The first argument to the c-call is c itself, while the first argument to the f-call is a. This is how a function definition made in a class becomes a method, but it also applies if we make it elsewhere and insert it later. (It does not matter that I called the first argument x and not self.)

This difference is made when Python looks up the attribute on the instance a:

>>> a.c
<__main__.MyCallable object at 0x0000023778A7C850>
>>> a.f
<bound method fun of <__main__.A object at 0x0000023778A7D790>>

a.c just looks up the value and retrieves it, while a.f creates a new object in which a has been squirrelled away so it can be delivered as the first argument when Python gets to the ():

>>> a.c == A.c
True
>>> a.f == A.f
False
>>> a.f.__self__ is a
True

It’s the __get__ method of a function that does this binding, and the __getattribute__ method of object that calls it.
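
You can poke at that mechanism directly (continuing the same session): calling __get__ by hand reproduces what attribute lookup does for a function found on a class, and shows that the MyCallable instance simply has no __get__ to call.

>>> hasattr(fun, '__get__')
True
>>> hasattr(A.c, '__get__')
False
>>> fun.__get__(a, A)    # what a.f does behind the scenes
<bound method fun of <__main__.A object at 0x0000023778A7D790>>
>>> fun.__get__(a, A).__self__ is a
True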

As @jeff5 mentions, the biggest difference is the descriptor protocol.
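
A minimal sketch of what that means in practice (BindingCallable and the use of types.MethodType here are my own illustration, not something from the examples above): give the callable class a __get__ and it binds just like a function does.

import types

class BindingCallable:
    def __call__(self, instance, *args, **kwargs):
        print(instance, args, kwargs)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self                      # accessed on the class: no binding
        return types.MethodType(self, obj)   # bind obj as the first argument

class A:
    c = BindingCallable()

a = A()
a.c(1, 2, s=3)  # now prints the A instance first, just like a.f did above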
