Systematically finding bugs in Python C extensions (575+ confirmed so far)

Well, here’s an estimate from the Claude that runs the plugin:

About 20-35% (the memory-safety bugs) would be prevented by Rust. The majority of what we find are Python/C API logic errors (wrong exception handling, wrong refcount protocol, wrong GIL discipline) that Rust doesn’t prevent. PyO3 helps more than Rust itself, by automating the refcount boilerplate that’s the source of most bugs.

Rust would NOT prevent (~60-70% of our findings):

  • Reference counting errors (leaked refs, borrowed-ref-across-call) — these are Python/C API semantics, not memory safety. PyO3 helps but doesn’t eliminate them.
  • PyErr_Clear swallowing MemoryError/KeyboardInterrupt — exception handling logic errors
  • Error-as-truthy (PyObject_IsTrue returning -1 treated as true) — logic errors
  • Missing NULL checks after failable API calls — the C API contract, not memory safety
  • GIL discipline (blocking with GIL held, missing GIL release) — concurrency design
  • Module state issues (single-phase init, global state) — Python/C API architecture
  • Version compatibility (deprecated APIs, dead version guards) — API evolution
  • new-without-init crashes — Python object lifecycle design
  • Re-init safety (calling init twice leaks resources) — API design
  • Exception clobbering — logic errors
  • Free-threading races (missing critical sections) — concurrency design, Rust doesn’t auto solve this
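
To make the error-as-truthy item concrete, here is a minimal sketch in Rust. `raw_is_true` is a hypothetical stand-in (not a real binding) that mimics the `PyObject_IsTrue` contract: 1 for true, 0 for false, -1 on error. The point is only that a raw integer status can be misread in Rust as easily as in C; the function names and the `Option<i64>` input are assumptions made for illustration.

```rust
/// Hypothetical stand-in for a raw C API call with a tri-state result,
/// modelled on the PyObject_IsTrue contract: 1 = true, 0 = false, -1 = error.
/// `None` plays the role of an object whose truth test fails.
fn raw_is_true(value: Option<i64>) -> i32 {
    match value {
        None => -1,
        Some(0) => 0,
        Some(_) => 1,
    }
}

/// Buggy: compiles without complaint, but -1 is nonzero,
/// so an error is silently treated as "true".
fn enabled_buggy(flag: Option<i64>) -> bool {
    raw_is_true(flag) != 0
}

/// Checked: the error case is separated out before the value is used.
fn enabled_checked(flag: Option<i64>) -> Result<bool, &'static str> {
    match raw_is_true(flag) {
        rc if rc < 0 => Err("truth test failed"),
        rc => Ok(rc != 0),
    }
}
```

With these definitions, `enabled_buggy(None)` returns `true`, while `enabled_checked(None)` returns an `Err`. The compiler accepts both versions; only the second follows the API contract.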

Rust WOULD prevent (~15-25%):

  • Use-after-free / double-free (the SetItem double-free pattern — 62 sites!)
  • Buffer overflows (rare in our findings, but when present)
  • Py_DECREF(NULL) — null pointer dereference
  • std::bad_alloc through C boundary (Rust panics are at least catchable)
  • Some heap-type dealloc issues (Rust’s ownership model would enforce cleanup order)
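
As an illustration of the double-free item, the sketch below assumes the pattern involves a reference-stealing setter such as `PyList_SetItem`, which consumes the item reference even when it fails, followed by a caller that releases the item again on the error path. That reading is an assumption, not something confirmed by the findings. `Item`, `set_item` and `drops_after_set` are hypothetical names, and the drop counter only stands in for a real deallocation.

```rust
use std::cell::Cell;

/// Hypothetical owned object; each drop bumps a shared counter.
struct Item<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Item<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Hypothetical setter that takes ownership of `item` on success AND on failure,
/// which is the Rust counterpart of a reference-stealing C call.
fn set_item<'a>(
    slots: &mut Vec<Item<'a>>,
    index: usize,
    item: Item<'a>,
) -> Result<(), &'static str> {
    match slots.get_mut(index) {
        Some(slot) => {
            *slot = item; // the previous occupant is dropped here
            Ok(())
        }
        None => Err("index out of range"), // `item` is dropped when we return
    }
}

/// Creates two items, attempts one set, and reports the total number of drops.
fn drops_after_set(index: usize) -> u32 {
    let drops = Cell::new(0);
    {
        let mut slots = vec![Item { drops: &drops }];
        let item = Item { drops: &drops };
        let _ = set_item(&mut slots, index, item);
        // drop(item); // rejected at compile time: `item` was moved (E0382)
    }
    drops.get()
}
```

Both `drops_after_set(0)` (the set succeeds) and `drops_after_set(5)` (the set fails) return 2: each of the two items is released exactly once. The extra release that causes the double free in C is the commented-out `drop(item)`, which the borrow checker refuses to compile.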

Partially prevented (~10-15%):

  • Heap type missing Py_DECREF(Py_TYPE(self)) — PyO3 handles this automatically, but it’s a PyO3 feature, not a Rust language feature
  • Resource leaks on error paths — Rust’s RAII helps but you can still leak via mem::forget
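
The resource-leak item can be sketched with the standard library alone. `Guard`, `failing_step` and `release_count` are hypothetical names; the counter stands in for releasing a real resource such as a file handle or an owned reference.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical RAII guard; dropping it "releases" the resource.
struct Guard<'a> {
    released: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

/// A step that always fails. The guard is cleaned up on the error path
/// unless it has been handed to `mem::forget`, which is safe code.
fn failing_step(released: &Cell<u32>, forget: bool) -> Result<(), &'static str> {
    let guard = Guard { released };
    if forget {
        mem::forget(guard); // the destructor never runs: the resource leaks
    }
    Err("simulated failure")
}

/// Returns how many times the resource was released after the failed step.
fn release_count(forget: bool) -> u32 {
    let released = Cell::new(0);
    let _ = failing_step(&released, forget);
    released.get()
}
```

Here `release_count(false)` returns 1, because the guard is dropped automatically on the early error return, and `release_count(true)` returns 0. Leaking is considered memory-safe in Rust, so the compiler does not flag the second case.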

Given LLMs’ trouble with numbers and estimates, I wouldn’t trust the percentages too much (I didn’t actually “run the numbers”, I just passed your question along). But the bug classes in each category seem correct to me.