I work on a very large Python monolith and we sometimes hit RecursionErrors in production, in particular due to long chains of imports. (If you’re interested, I’ve provided an analysis in this sample repository.)
We are currently dealing with this by increasing the recursion limit. But what we don’t know is how much headroom we have at any one time.
It would be really helpful if Python provided a way of monitoring the maximum stack depth reached during a process, so we could proactively raise the recursion limit if we notice that we are getting too close.
We can get the stack depth at any time using inspect.stack(), so perhaps we could gain some insights by calling this from some other low-level function (e.g. logging). But this won’t give us the true maximum. I wonder if anyone is aware of any mechanism by which we could get this number, or has any other ideas?
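For reference, a minimal way to read the current depth, e.g. from a logging filter (a sketch; sampling like this only catches whatever depth happens to be live at call time):

import inspect

def current_stack_depth():
    # inspect.stack() builds a FrameInfo for every live frame, so this
    # is costly; fine for occasional sampling, not for hot paths
    return len(inspect.stack())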
There is no recursion - it’s just a chain of imports. In a completely blank project apart from imports, Python will raise a RecursionError after 124 imports (with the default recursion limit of 1000).
You’re right, chains of this length could (and probably should) be factored out. But in a very large project like ours that is difficult to achieve (especially if we can’t measure it).
I work on a very large Python monolith and we sometimes hit
RecursionErrors in production, in particular due to long chains of
imports.
I’m… impressed
We can get the stack depth at any time using inspect.stack(), so
perhaps we could gain some insights by calling this from some other
low-level function (e.g. logging). But this won’t give us the true
maximum. I wonder if anyone is aware of any mechanism by which we could
get this number, or has any other ideas?
I see you’ve found inspect.stack. Possibly you can plug into the
import machinery in importlib to check on this during the process.
Since imports are done only once per module, can you establish what the
deepest chain of imports will be? Even something as simple as grepping
the import lines from the code base and inspecting the graph they
imply.
Then you could have a utility module which imports all of the heavily
used ones in one flat go, in leaf-most-first order (i.e. so that the
utility module itself doesn’t recurse).
Then early in the problematic modules, import the utility bulk-import
module first.
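To make the grep-the-graph idea concrete, here is a rough sketch using ast rather than grep. It assumes a src/ layout, the file-to-module name mapping is crude, dynamic imports are invisible to it, and cycles or external modules simply end a chain:

import ast
import pathlib

def imports_of(path):
    tree = ast.parse(path.read_text())
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            yield from (alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            yield node.module

graph = {}  # module name -> names of the modules it imports
for path in pathlib.Path("src").rglob("*.py"):
    name = ".".join(path.with_suffix("").parts[1:])  # crude name mapping
    graph[name] = list(imports_of(path))

def chain_depth(name, seen=frozenset()):
    # length of the deepest import chain starting at `name`;
    # cycles and modules outside the code base end a chain
    if name in seen or name not in graph:
        return 1
    return 1 + max((chain_depth(d, seen | {name}) for d in graph[name]), default=0)

print(max((chain_depth(m) for m in graph), default=0))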
Cameron’s suggestion of plugging into the import machinery is a good one. You can try something like:

import inspect
import sys

class MeasureStackDepth:
    max_depth = 0  # deepest stack seen during any import

    def find_spec(self, fullname, path, target=None):
        # record the stack depth, then decline so the regular finders run
        MeasureStackDepth.max_depth = max(MeasureStackDepth.max_depth, len(inspect.stack()))
        return None

# do this before the imports you care about
sys.meta_path.insert(0, MeasureStackDepth())
I work on the same codebase and we actually disabled Datadog’s Watchdog because the way it inserts itself into the import machinery was adding a lot of extra frames!
Is a large depth actually still dangerous when the limit is set to something huge? Years ago I had crashes around depth 6000 on my PC, or 30000 elsewhere, but since CPython 3.11 that seems to have become obsolete…
If you mean dangerous in the sense of crashing the Python interpreter — it is now much, much harder to crash the Python interpreter via deep stacks / recursion. See Crashes from recursion in CPython (in particular, towards the end) for a full rundown of changes in recent Python versions.
Yes, your “The brighter future” section is exactly what I meant: setting a huge limit and then deep recursion crashing badly, now apparently avoided with those changes in 3.11 and 3.12. I’m still not sure whether all cases are covered, though. What do you think about David’s situation? Apparently they’re trying to increase the limit “just enough”, but does that still make sense? Or could they just set a huge limit without risk? (Assuming they use CPython 3.12, which their demo project seems to suggest they do.)
Unless you have a module that dynamically builds a call tree of an indefinite depth (likely from user-supplied data sources) you should be OK with simply increasing the recursion limit to a larger fixed number that fits the need of your project.
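For instance (the number is illustrative, and it should run as early as possible, before the deep imports happen):

import sys

sys.setrecursionlimit(20_000)  # fixed headroom sized for the project's deepest chains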
If you do want to monitor the stack depth and possibly increase the recursion limit on demand, you can use a profiler to keep track of calls and returns:
import sys

class indefinite_recursion_limit:
    def __enter__(self):
        # count the frames already on the stack when we enter
        self.level = 3  # the interpreter pre-occupies the top 2 frames
        frame = sys._getframe(1)
        while frame := frame.f_back:
            self.level += 1
        self.limit = self.orig_limit = sys.getrecursionlimit()
        self.orig_profile = sys.getprofile()
        sys.setprofile(self._ensure_recursion_safety)

    def __exit__(self, exc_type, exc_value, exc_tb):
        # restore the original limit and profiler on the way out
        sys.setrecursionlimit(self.orig_limit)
        sys.setprofile(self.orig_profile)

    def _ensure_recursion_safety(self, frame, event, arg):
        if event == 'call':
            self.level += 1
            if self.level == self.limit:
                self.limit = int(self.limit * 1.2)  # increase by 20%
                print(f'recursion limit increased to {self.limit}')
                sys.setrecursionlimit(self.limit)
        elif event == 'return':
            self.level -= 1
so that:
def f(n):
    if n == 0:
        return 0
    return n + f(n - 1)

with indefinite_recursion_limit():
    print(f(10000))
outputs:
recursion limit increased to 1200
recursion limit increased to 1440
recursion limit increased to 1728
recursion limit increased to 2073
recursion limit increased to 2487
recursion limit increased to 2984
recursion limit increased to 3580
recursion limit increased to 4296
recursion limit increased to 5155
recursion limit increased to 6186
recursion limit increased to 7423
recursion limit increased to 8907
recursion limit increased to 10688
50005000
Recursion of Python code is now practically unlimited. sys.setrecursionlimit() is only used to guard against infinite recursion caused by a programming error. But if the call chain includes C frames, they consume the limited C stack, and the following Python frames need a new portion of the C stack as well. Even though the import machinery is mainly implemented in Python, an import passes through several C frames (for import/__import__, for exec(), maybe more), so it cannot be infinitely recursive. Deep enough recursion will crash.
Create a set of files module{i}.py with content print({i}); import module{i+1}, import the first one, and see how far it will go before it crashes.
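For anyone who wants to try this, a throwaway generator along those lines (the package name, chain length and relative-import style are illustrative):

# generate chain/module0.py .. chain/module{N-1}.py, each importing the next
import pathlib

N = 5000  # illustrative chain length
pkg = pathlib.Path("chain")
pkg.mkdir(exist_ok=True)
(pkg / "__init__.py").write_text("")
for i in range(N):
    body = f"print({i})"
    if i < N - 1:
        body += f"; from . import module{i + 1}"
    (pkg / f"module{i}.py").write_text(body + "\n")

# then run: python -c "from chain import module0"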
Create a set of files module{i}.py with content print({i}); import module{i+1}, import the first one, and see how far it will go before it crashes.
Good idea. I used the command in the demo repository to create a chain of several thousand modules. It appears that the chain length tops out at 1,249 modules: raising the recursion limit helps up to around 10,000, but setting it any higher stops having any effect.
File ".../import-recursion-demo/src/demo/mod_1249.py", line 1, in <module>
from . import mod_1250
File "<frozen importlib._bootstrap>", line 1357, in _find_and_load
RecursionError: maximum recursion depth exceeded
It crashes with a RecursionError rather than anything else.
Since imports are done only once per module, can you establish what the
deepest chain of imports will be?
The trouble is, many of our imports are dynamic (e.g. different modules are imported based on environment variables) and are sometimes triggered at runtime (e.g. while serving an HTTP request). So it would be very difficult to determine the max depth through static analysis. That said, we might still be able to get some insight that way.
Then you could have a utility module which imports all of the heavily
used ones in one flat go, in leaf-most-first order (i.e. so that the
utility module itself doesn’t recurse).
Oh that is an interesting idea… I’ll do an experiment to see if that allows a deeper chain.
You’re right! Changing the order of imports does indeed prevent the recursion error. I was able to import a chain of 10,000 modules provided I imported them in reverse order.
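For anyone reproducing this, the reverse-order trick can be as simple as the following (module names match the hypothetical generator sketched earlier):

import importlib

# leaf-most-first: by the time module{i} runs `from . import module{i+1}`,
# that module is already in sys.modules, so no import ever recurses
for i in reversed(range(10_000)):
    importlib.import_module(f"chain.module{i}")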
For the reasons I mention above, I suspect it would be non-trivial to determine which modules should be imported in this way, but it’s a useful option to consider if we do start to hit hard limits.
It’s more complicated than that - we run our application under many different configurations with a variety of interchangeable plugins. And I have a feeling that the max depth might be affected by the order in which HTTP requests are served.
That said, we probably could statically determine an optimal way of pre-importing all the modules during bootstrap, in such a way that we minimize the length of import chains.
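If we go that way, the standard library’s graphlib could emit the order from a static import graph like the one sketched earlier (this assumes the hypothetical `graph` dict from that sketch; it raises CycleError on import cycles, which would need breaking first):

from graphlib import TopologicalSorter

# `graph` maps each module to the modules it imports; static_order()
# yields dependencies before their dependents, i.e. leaf-most-first
order = TopologicalSorter(graph).static_order()
with open("bootstrap_imports.py", "w") as f:
    for name in order:
        f.write(f"import {name}\n")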
@storchaka out of interest, do you think this is a bug in CPython? First, it seems wrong to raise a RecursionError when there is no recursion; second, it seems strange that there is a hard limit for this when there isn’t one for genuine recursion.
That’d help! It doesn’t even have to be perfect. If you pre-import most, but not all, of the modules you need, it’ll still help enormously. Conversely, if you pre-import one or two that you don’t need, the cost is fairly small (at least, assuming the modules don’t have vital side effects?).