Garbage collection of function context when nested function is returned

let’s say we have

def a():
    massive_local_object=...
    small_local_object=...
    return lambda:small_local_object

since python is dynamically evaluated, i can’t imagine the garbage collector being smart enough to gc massive_local_object before the returned lambda is no longer referenced, that is, before the local function context no longer has to be kept alive.

so if i had to return only a few of the locals in a function context, it would probably be better to do so with regular data structures than with functions that use the local function context, so that unneeded local objects can be gc’d?

Well, actually, it is! There’s a difference between “fast locals” and “cell variables” in CPython; anything declared nonlocal and anything referenced by a nested function is put into a closure cell rather than a fast local.

>>> import dis
>>> dis.dis(a)
None           0 MAKE_CELL                1 (small_local_object)

   1           2 RESUME                   0

   2           4 LOAD_CONST               1 (Ellipsis)
               6 STORE_FAST               0 (massive_local_object)

   3           8 LOAD_CONST               1 (Ellipsis)
              10 STORE_DEREF              1 (small_local_object)

   4          12 LOAD_FAST                1 (small_local_object)
              14 BUILD_TUPLE              1
              16 LOAD_CONST               2 (<code object <lambda> at 0x7f37a7195070, file "<stdin>", line 4>)
              18 MAKE_FUNCTION
              20 SET_FUNCTION_ATTRIBUTE   8 (closure)
              22 RETURN_VALUE

The one that isn’t needed past the immediate function return is a fast local, so it’ll be disposed of; the one that’s needed for the lambda function is a closure cell, set with STORE_DEREF.
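The same split can be checked without reading bytecode, by looking at the code object and the returned function. This is only a sketch for CPython; the names `f`, `cellvars`, `varnames`, `freevars` and `closure_contents` are made up for the demo.

```python
def a():
    massive_local_object = ...
    small_local_object = ...
    return lambda: small_local_object

f = a()

# Names of a()'s locals that live in closure cells
cellvars = a.__code__.co_cellvars
print(cellvars)  # ('small_local_object',)

# Names of a()'s plain locals, released when the call returns
varnames = a.__code__.co_varnames
print(varnames)

# Names the lambda picks up from its enclosing scope
freevars = f.__code__.co_freevars
print(freevars)  # ('small_local_object',)

# The lambda holds only the cells it needs, not the whole frame of a()
closure_contents = [cell.cell_contents for cell in f.__closure__]
print(len(closure_contents))  # 1
```

So the lambda keeps a reference to one cell, and nothing in it points back at massive_local_object.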

Test it to see what happens:

>>> class Foo:
...     def __init__(self, name):
...         self.name = name
...     def __del__(self):
...         print(f'Foo.__del__ called for {self.name}')
...
>>> def a():
...     massive_local_object = Foo('massive')
...     small_local_object = Foo('small')
...     return lambda: small_local_object
...
>>> result = a()
Foo.__del__ called for massive

oh wow, that’s cool. right, in hindsight, it also makes sense that the compiler is checking that, as it has to to detect unbound nonlocal declarations
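A quick way to see that this check happens at compile time: compiling (not running) a function with a nonlocal that has no enclosing binding already fails. A small sketch; `source`, `outer`, `inner`, `missing_name` and `error_message` are invented for the demo.

```python
source = """
def outer():
    def inner():
        nonlocal missing_name  # no enclosing function binds this name
        missing_name = 1
    return inner
"""

try:
    # compile() only builds the code object; outer() is never called
    compile(source, "<demo>", "exec")
except SyntaxError as exc:
    error_message = exc.msg
else:
    error_message = None

print(error_message)
```

The error comes from the compiler's scope analysis, which is the same pass that decides which locals become cells.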

ye i suppose the interactive shell has always been quite fast with gc’ing, worst case you throw in a big numpy allocation and it starts gc’ing (eVeN ThOuGh iT Is nOt gUaRaNtEeD)

The GC code is only needed to detect object cycles.

In the normal case the ref count goes to 0 and the object is deleted without the need to run the GC.
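To illustrate the difference, here is a small sketch that assumes CPython's reference counting (other implementations may behave differently). `Node` and the variable names are invented for the demo; automatic collection is switched off so the timing is predictable.

```python
import gc
import weakref

class Node:
    pass

gc.disable()  # keep the cycle collector from running on its own during the demo

# No cycle: the object is freed as soon as its last reference goes away
n = Node()
ref_plain = weakref.ref(n)
del n
plain_freed_immediately = ref_plain() is None

# Cycle: the object refers to itself, so its ref count never reaches 0
m = Node()
m.self_ref = m
ref_cycle = weakref.ref(m)
del m
cycle_alive_before_collect = ref_cycle() is not None

gc.collect()  # an explicit collection still works while automatic GC is disabled
cycle_freed_after_collect = ref_cycle() is None

gc.enable()

print(plain_freed_immediately, cycle_alive_before_collect, cycle_freed_after_collect)
```

Only the self-referencing object needs the collector; the plain one is gone the moment the name is deleted.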

If you really need to trigger a GC, you can use

import gc
gc.collect()

As @barry-scott explained, this is normally not needed.