Thread-safe way of disabling the GC (in C, mainly)

I suspect the answer is “you can’t have this” but it’s always worth starting the discussion anyway.

Background

Cython’s decided to expose the free-threading critical section as a decorator/context-manager, so users can do things like:

with critical_section(some_object):
    some_object.some_c_attribute += 1

I was trying to document a little more what type of guarantees this gives, especially since Cython users are a little more abstracted than C users from exactly what code it generates. And also what that means for a “GIL build” of Python, where the critical section would be a no-op and they’d just be protected by the GIL.

If you’re manipulating Python objects within that block, one big weakness is that finalizers can run arbitrary code and that can both interrupt the critical section and release the GIL (with the critical section being a little safer, since it doesn’t get released unless the arbitrary code accesses some_object).

It’s generally possible to avoid finalizers by being careful about where you delete objects. Except for the GC which can also run as a result of almost any allocation (and it’s usually possible to delay destruction, but not allocation).
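To make the "be careful about where you delete objects" point concrete, here is a small pure-Python sketch (not Cython). The `Noisy` class, the `events` list and both functions are made up for illustration, and the ordering relies on CPython's reference counting, where a finalizer runs as soon as the last reference goes away.

```python
events = []

class Noisy:
    """Hypothetical type whose finalizer stands in for 'arbitrary code'."""
    def __del__(self):
        events.append("finalizer ran")

def replace_inside_block(holder):
    events.append("block start")
    holder["obj"] = None      # last reference dropped here: finalizer runs mid-block
    events.append("block end")

def replace_with_deferred_delete(holder):
    events.append("block start")
    old = holder["obj"]       # keep the old object alive across the block
    holder["obj"] = None
    events.append("block end")
    del old                   # finalizer runs only now, after the block

replace_inside_block({"obj": Noisy()})
inside = list(events)

events.clear()
replace_with_deferred_delete({"obj": Noisy()})
deferred = list(events)
```

In the first variant the finalizer lands between "block start" and "block end"; in the second it is pushed past the end of the block. This only helps for deletions you control, not for collections triggered by the GC.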

What I’d like to have done

I then thought that it would be nice to be able to write something like:

with block_gc():
    ...  # small thread-safe code block

(where Cython can transform block_gc into direct C API calls). That way you could be confident that the GC doesn’t run arbitrary code within your block.

This doesn’t work though, because the GC is controlled by a single flag on the interpreter state, so changing it from multiple threads at once will end up fighting.
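As a rough illustration of the "fighting", the sketch below replays one possible interleaving of two threads by hand, using the Python-level `gc` module rather than the C API. The helpers `enter_block` and `exit_block` are hypothetical and simply mimic the "disable, remember the previous state, restore on exit" pattern.

```python
import gc

def enter_block():
    """Naive entry: remember whether the GC was on, then turn it off."""
    was_enabled = gc.isenabled()
    gc.disable()
    return was_enabled

def exit_block(was_enabled):
    """Naive exit: re-enable only if the GC was on when we entered."""
    if was_enabled:
        gc.enable()

originally_enabled = gc.isenabled()
gc.enable()

# One possible interleaving of threads A and B, replayed sequentially.
a_prev = enter_block()                   # A enters: saw "enabled", disables
b_prev = enter_block()                   # B enters: saw "disabled"
exit_block(a_prev)                       # A leaves: re-enables the GC
gc_on_while_b_in_block = gc.isenabled()  # B is still inside its block
exit_block(b_prev)                       # B leaves: saw "disabled", does nothing

# Put the interpreter back the way we found it.
if originally_enabled:
    gc.enable()
else:
    gc.disable()
```

Here `gc_on_while_b_in_block` ends up true: thread B believes it is protected, but thread A has already switched the GC back on.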

Caveat

This is mostly driven by me thinking “what facilities would we like to provide”. However nobody’s actually asked for it yet so it might well be something that seems nice to me but that nobody would use in the real world.


So, PyGC_Disable/PyGC_Enable should probably grow variants that count the “number of disable requests”, and we should deprecate the “return previous value” style that forces users to rely on the GIL?


As far as I understand it, this is a dangerous misconception.
A critical section has a per-object lock; but other than that it should give you similar (surprisingly weak) guarantees as the GIL. Crucially, APIs that traditionally release the GIL, like Py_BEGIN_ALLOW_THREADS, will release critical section locks – even in arbitrary code that doesn’t know anything about the locked object.
So, yes, CSs are the wrong tool for a “block” decorator.
The docs have the full story. If you can find a wording to make things clearer, please send a PR.

Yes - that’s what I was thinking of (assuming that people agree that this is a good idea…)

In some ways they’re no worse than with gil: which Cython also provides - it’s OK provided you know what it’s doing.

We’ve documented it as “top-level code will run with the lock held on this object, but if you do Python stuff you shouldn’t assume the whole section will run atomically”.

My intention is that it’d mainly be suitable for small blocks dealing with C attributes of extension types and I’m just trying to document how far you can stretch this.

Good point - this is something I should also document.


If I understand correctly, this is about Cython code that is going to be compiled to C, so even though it looks like Python syntax, it’s not bytecode executed by the Python interpreter. In that case, I think with critical_section(...) can make sense as nicer syntax for the existing critical section C API.

To be clear, with critical_section(...) would not make sense for Python code because nearly every expression or bytecode can implicitly, temporarily release the critical section in free threading builds (or the GIL in GIL-enabled builds).

Except for the GC which can also run as a result of almost any allocation…

That’s not the case anymore, since Python 3.12, so it’s not a concern for any of the free threading builds.

I don’t think you want to disable GC here. Like you say, you need to be careful about the calls you make inside the critical section. I expect that’s a bit trickier in Cython than when writing C code directly because a lot of Py_DECREF() calls will be implicitly generated by Cython that you’d have to explicitly write in C.


Yes you understand correctly. And I agree that wouldn’t make sense as a Python-level decorator.

I didn’t know that - if that’s the case then there’s probably less need to be able to disable the GC as I described. Out of interest, what does trigger the GC?

True - although not impossible. That might be something that we should look at in Cython, to defer Py_DECREF until after the critical section (although obviously we could only do that for calls that we can directly see).

GC allocations increment a counter and deallocations decrement that same counter. In 3.11 and earlier, when that counter exceeds some threshold, the GC is run immediately during that GC allocation. In 3.12 and later, the GC is instead scheduled to run sometime soon by setting a bit in the eval breaker. The GC is then run when the Python bytecode interpreter checks the eval breaker.
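The counter and threshold described above can be inspected from Python with the `gc` module. The snippet below is only a sketch for CPython: the exact numbers depend on the version and build, so it records the values without assuming what they are. The automatic GC is switched off while measuring so that a collection does not reset the counter halfway through.

```python
import gc

was_enabled = gc.isenabled()
gc.disable()                       # avoid an automatic collection resetting the counter
gc.collect()                       # start from a freshly reset counter

threshold0 = gc.get_threshold()[0]  # threshold for the youngest generation
before = gc.get_count()[0]          # current value of the allocation counter

keep = [[] for _ in range(2000)]    # lists are GC-tracked, so each one bumps the counter
after = gc.get_count()[0]

if was_enabled:
    gc.enable()                     # restore the original setting
```

With the GC enabled, crossing `threshold0` is what causes a collection to run (3.11 and earlier) or to be scheduled via the eval breaker (3.12 and later).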

Yeah, I don’t mean to suggest that it’s impossible. Like you said earlier, it’s a lot like the with gil: blocks.

EDIT: I messed up the Python versions. The change in GC behavior started in 3.12.


Thanks both.

I think based on the explanation of when the GC runs, I agree that I don’t have a great need for this feature.

I still think it’d be nice to be able to switch the GC on and off in a thread-safe manner, but I don’t think anyone should do it on my behalf given the lack of real use-case.

(I did also find an old issue asking for a GC context manager, although that was in terms of “should we provide it” rather than “should we change how it works”.)

You might be looking for a way to disable the eval breaker, since Ctrl+C can have even bigger effects than a GC run.
But I don’t think we want to allow that kind of control over whatever’s downstream in the call graph.

I mean, this sounds like a reasonable thing to do if you’re embedding Python (if you’re “just a library” then obviously not). More reasonable than disabling the GC, anyway. Most embedding scenarios I’ve encountered would rather use their own mechanism to break in besides Ctrl+C.