Support for Cross-Language Garbage Collection

After a year of work on ways to unify the two conflicting garbage collection systems of Python and Java, I have a demonstration to share. This approach may also work for other systems such as C#, in which the foreign language provides GC but no native support for a distributed system. The demonstration exploits features of the Python garbage collection system to provide the information required for managing leases and resolving reference loops.

Perhaps this is something we can work into a PEP?

/* 
 * Cross-Language Garbage Collection (CLGC) Demonstration Module
 *
 * This module addresses the challenges of garbage collection across languages,
 * specifically between Python and Java, by introducing hooks into Python's
 * generation 2 garbage collector. It demonstrates how Python can efficiently
 * manage cross-language references through a process called "internalization."
 *
 * The implementation assumes a "double weak" referencing system, where each
 * language holds its own strong references to its objects. Weak references are
 * used to communicate when foreign objects are no longer needed, allowing them
 * to be dropped. Internalization further enhances this by replacing strong
 * references with Python-side graph portions, enabling normal garbage
 * collection.
 *
 * Internalization Process:
 * ========================
 * Internalization is the process by which Python identifies isolated objects
 * connected to foreign systems, delegates ownership of their lifecycle to the
 * foreign system, and ensures proper cleanup of cross-language references. This
 * ensures Python no longer holds responsibility for keeping foreign objects
 * alive, allowing the foreign system to manage their lifecycle efficiently.
 *
 * Mechanics:
 * ----------
 * 1. **Detection of Isolated Python Objects:** 
 *    - During Python's garbage collection traversal, the `ReferenceManager`
 *      identifies Python objects that are not connected to other Python objects
 *      except through the `ReferenceManager`.
 *    - These objects are considered "isolated" from Python's perspective and
 *      are candidates for internalization.
 *
 * 2. **Depth-First Search (DFS):**
 *    - The `ReferenceManager` uses Python's traversal system to perform a
 *      depth-first search (DFS) on the reference graph rooted at the isolated
 *      Python object.
 *    - The DFS explores all references originating from the Python object to
 *      determine its relationship with foreign objects (e.g., Java objects).
 *
 * 3. **Discovery of Java References:**
 *    - During the traversal, if the `ReferenceManager` encounters a reference
 *      to a Java object:
 *      - It checks whether the Java object has already been visited by Java's
 *        garbage collector during its cycle.
 *      - If the Java object **has not been visited**, it means the Java object
 *        is still reachable from Python but not actively held by Java.
 *
 * 4. **Delegating Ownership to Java:**
 *    - If an unvisited Java object is discovered during the DFS:
 *      - The relationship between the Python object and the Java object is sent
 *        to the Java side.
 *      - On the Java side:
 *        - The Java object is removed from the **strong global reference list**,
 *          meaning Java no longer holds the object alive directly.
 *        - The Java object is now held alive only by the weak reference
 *          originating from Python.
 *
 * 5. **Handling Cases with No Java References:**
 *    - If the depth-first search does not discover any Java references:
 *      - The Python object is added to the `foreign_list` in the
 *        `ReferenceManager`.
 *      - The Python object will remain on the `foreign_list` until Java breaks
 *        its weak link to the object.
 *      - Since no Java references were found, there cannot be a cross-language
 *        reference loop involving this Python object.
 *
 * Key Scenarios:
 * --------------
 * - **Scenario 1:** Python Object References a Java Object
 *   - The Python object is isolated except for its reference to the Java object.
 *   - The DFS discovers the Java object, which has not been visited by Java's
 *     garbage collector.
 *   - The relationship is delegated to Java, and the Java object is removed
 *     from the strong global reference list.
 *   - **Outcome:** Python no longer holds responsibility for the Java object,
 *     and the Java object is managed by Java.
 *
 * - **Scenario 2:** Python Object Does Not Reference Any Java Object
 *   - The Python object is isolated.
 *   - The DFS does not discover any Java references.
 *   - The Python object is added to the `foreign_list` and waits for Java to
 *     break its weak link.
 *   - **Outcome:** The Python object remains on the `foreign_list` until Java
 *     determines it is no longer needed.
 *
 * - **Scenario 3:** Cross-Language Reference Loop
 *   - The DFS discovers a loop involving both Python and Java objects.
 *   - The relationship is delegated to Java, and Java assumes ownership of the
 *     loop.
 *   - **Outcome:** Python no longer holds responsibility for the loop, and Java
 *     manages its lifecycle.
 *
 * Advantages:
 * -----------
 * - **Efficient Garbage Collection:** Delegating ownership of cross-language
 *   relationships to the foreign system reduces Python's involvement in
 *   managing foreign objects.
 * - **Breaks Reference Loops:** Internalization ensures that reference loops
 *   involving both Python and Java are broken, preventing memory leaks.
 * - **Optimized Resource Management:** Objects are cleaned up promptly when no
 *   references exist on either side.
 *
 * Edge Cases:
 * -----------
 * - **Case 1:** Java Object Referenced by Multiple Python Objects
 *   - If multiple Python objects reference the same Java object, the Java
 *     object will remain alive until all Python references are removed.
 * - **Case 2:** Weak Links
 *   - If the Java object is held alive only by a weak link from Python, it will
 *     be garbage collected by Java once Python removes its reference.
 * - **Case 3:** Synchronization Delays
 *   - If Python and Java garbage collection cycles are not synchronized, there
 *     may be slight delays in cleanup. This is not critical but could be
 *     optimized.
 *
 * Python GC Phases:
 * -----------------
 * - **Phase 1: subtract_refs**
 *   - Decreases reference counts for objects in the collection set. If an
 *     object's reference count drops to zero, it is considered unreachable.
 *   - The `arg` parameter in the traversal callback (`visit_decref`) represents
 *     the parent object being traversed.
 *
 * - **Phase 2: move_reachable**
 *   - Identifies all reachable objects starting from the roots and moves them,
 *     along with their dependencies, into the reachable list.
 *   - The `arg` parameter in this phase typically represents the new list of
 *     reachable items (`PyGC_Head*`).
 *
 * - **Phase 3 (Debugging Enabled):**
 *   - Exploits properties of Python's GC to pivot the Sentinel object to the
 *     back of the GC list. All objects afterward must be owned by a foreign
 *     object or collected. At this point, the structure and links between
 *     objects can be analyzed.
 *
 * Implementation Notes:
 * ---------------------
 * - Internalization is only possible on the Python side because Java lacks a
 *   traversal mechanism to discover relationships.
 * - Proper synchronization between Python and Java garbage collection processes
 *   is essential for efficient cleanup.
 * - Ensure the `is_broken` function is implemented correctly and efficiently,
 *   as it plays a critical role in determining the lifecycle of objects.
 *
 * Future Considerations:
 * ----------------------
 * - A future PEP could define the minimal API required to support CLGC:
 *   1. A method to report the type of GC being executed at the start of the
 *      cycle.
 *   2. Callbacks at each major phase to allow state alterations.
 *   3. Methods to check object states during each phase, accounting for Python
 *      objects potentially existing in multiple states within the same phase.
 *
 * This module simulates interactions with Java objects and demonstrates how
 * Python's GC can be extended for cross-language garbage collection.
 */
// Define Py_BUILD_CORE to access internal headers
#define Py_BUILD_CORE

#include <Python.h>
#include <frameobject.h>
#include <internal/pycore_gc.h>      // Internal header for GC structures
#include <internal/pycore_interp.h>  // Internal header for interpreter state

// Macros for accessing GC internals
#define GC_NEXT _PyGCHead_NEXT
#define GC_PREV _PyGCHead_PREV
#define GEN_HEAD(gcstate, n) (&(gcstate)->generations[n].head)
#define PREV_MASK_COLLECTING _PyGC_PREV_MASK_COLLECTING
#define NEXT_MASK_UNREACHABLE (1)
#define AS_GC(o) ((PyGC_Head *)(o)-1)

/* 
 * Utility functions for GC operations 
 */

// Check if a GC object is currently being collected
static inline int gc_is_collecting(PyGC_Head *g) {
    return (g->_gc_prev & PREV_MASK_COLLECTING) != 0;
}

// Get the reference count of a GC object
static inline Py_ssize_t gc_get_refs(PyGC_Head *g) {
    return (Py_ssize_t)(g->_gc_prev >> _PyGC_PREV_SHIFT);
}

// Append a node to a GC list
static inline void gc_list_append(PyGC_Head *node, PyGC_Head *list) {
    PyGC_Head *last = (PyGC_Head *)list->_gc_prev;
    _PyGCHead_SET_PREV(node, last);
    _PyGCHead_SET_NEXT(last, node);
    _PyGCHead_SET_NEXT(node, list);
    list->_gc_prev = (uintptr_t)node;
}

// Remove a node from its current GC list
static inline void gc_list_remove(PyGC_Head *node) {
    PyGC_Head *prev = GC_PREV(node);
    PyGC_Head *next = GC_NEXT(node);
    _PyGCHead_SET_NEXT(prev, next);
    _PyGCHead_SET_PREV(next, prev);
    node->_gc_next = 0;  /* Object is not currently tracked */
}

// Check if an object is garbage-collectable
static inline int _PyObject_IS_GC(PyObject *obj) {
    return (PyType_IS_GC(Py_TYPE(obj)) && 
            (Py_TYPE(obj)->tp_is_gc == NULL || Py_TYPE(obj)->tp_is_gc(obj)));
}

typedef struct _gc_runtime_state PyGCState;
static PyGCState* get_gc_state(void)
{
    // Access the interpreter state
    PyInterpreterState* interp = PyInterpreterState_Get();
    if (!interp) {
        PyErr_SetString(PyExc_RuntimeError, "Failed to get interpreter state");
        return NULL;
    }

    // The GC state is embedded in the interpreter state, so its address is never NULL
    return &interp->gc;
}
/* 
 * ForeignReference Structure
 *
 * Represents a reference to a foreign object (e.g., Java object) held by Python.
 * This structure is used to manage cross-language references.
 */
typedef struct ForeignReference {
    struct ForeignReference* previous;  // Previous reference in the list
    struct ForeignReference* next;      // Next reference in the list
    PyObject* local_object;             // Pointer to the local Python object (strong reference)
    void* remote_object;                // Pointer to the remote object (generic type)
} ForeignReference;

/* Static variables for CLGC state */
static int generation = -1;                 // Current GC generation
static int phase = -1;                      // Current GC phase
static PyObject* sentinel_instance = NULL;  // Sentinel object for GC tracking
static ForeignReference references;         // List of foreign references
static void* magic;                         // Magic identifier for private classes
static PyObject* gc_event_callback_function = NULL;  // GC event callback function

/* 
 * Initialize the foreign reference list.
 * The list is circular and starts with a dummy head node.
 */
static void reference_init(ForeignReference* head) {
    head->next = head;
    head->previous = head;
}

/* 
 * Insert a new item into the foreign reference list.
 */
static void reference_insert(ForeignReference *head, ForeignReference* item) {
    ForeignReference* next = head->next;
    head->next = item;
    item->next = next;
    item->previous = head;
    next->previous = item;
}

/* 
 * Remove an item from the foreign reference list.
 */
static void reference_remove(ForeignReference* item) {
    if (item->previous == NULL || item->next == NULL) return;
    ForeignReference* prev = item->previous;
    ForeignReference* next = item->next;
    prev->next = next;
    next->previous = prev;
    item->previous = NULL;
    item->next = NULL;
}

/* 
 * Visit all references in the foreign reference list.
 * This is used during GC traversal to ensure proper cleanup.
 */
static int visit_references(visitproc visit, void* arg) {
    ForeignReference* current = references.next;
    while (current != &references) {
        PyObject* op = current->local_object;
        Py_VISIT(op);
        current = current->next;
    }
    return 0;
}

/* 
 * ForeignObject Type
 *
 * Represents a foreign object held by Python. This type is used to simulate
 * interactions with Java objects in the CLGC process.
 */
typedef struct {
    PyObject_HEAD
} ForeignObject;

/* 
 * Free function for ForeignObject.
 * Ensures proper cleanup of foreign objects during GC.
 */
static void Foreign_free(void* obj) {
    PyTypeObject *type = Py_TYPE(obj);
    if (type->tp_flags & Py_TPFLAGS_HAVE_GC)
        PyObject_GC_Del(obj);
    else
        PyObject_Free(obj);
}

/* 
 * Renew the lease for a foreign object.
 * This is called when the foreign object is still reachable from Python.
 */
static void foreign_renew(ForeignObject* self) {
    printf("renew %p\n", (void*)self);
}

/* 
 * Skip renewal for a foreign object.
 * This is called when the foreign object is no longer reachable from Python.
 */
static void foreign_skip(ForeignObject* self) {
    printf("skip %p\n", (void*)self);
}

/* 
 * Traverse function for ForeignObject.
 * Handles GC traversal for foreign objects during specific GC phases.
 */
static int Foreign_traverse(ForeignObject* self, visitproc visit, void* arg) {
    if (phase < 1) return 0;
    if (phase == 1) foreign_renew(self);
    if (phase == 2) foreign_skip(self);
    return 0;
}

/* 
 * ForeignObject type definition.
 */
static PyTypeObject ForeignType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "helloworld.Foreign",
    .tp_basicsize = sizeof(ForeignObject),
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
    .tp_free = Foreign_free,
    .tp_traverse = (traverseproc) Foreign_traverse,
    .tp_new = PyType_GenericNew,
};


/* 
 * Internalization Functions
 *
 * These functions handle the process of internalization, where Python identifies isolated
 * objects connected to foreign systems and delegates ownership of their lifecycle to the foreign system.
 */

/* 
 * Start internalization for a Python object.
 * This function is a stub for interacting with Java during the internalization process.
 */
static void internalize_start(PyObject *obj, void* arg) {
    printf(" %p ::", (void*)obj);
}

/* 
 * Add a Python object to the internalization process.
 * This function is a stub for interacting with Java during the internalization process.
 */
static void internalize_add(PyObject* obj, void* arg) {
    printf(" %p", (void*)obj);
}

/* 
 * End internalization for a Python object.
 * This function is a stub for interacting with Java during the internalization process.
 */
static void internalize_end(PyObject* obj, void* arg) {
    printf("\n");
}

/* 
 * Perform a depth-first search (DFS) using Python's traversal mechanism.
 * This function analyzes relationships between foreign incoming references and foreign outgoing ones.
 */
static int internalize_trace(PyObject *op, void *arg) {
    // Skip non-GC objects
    if (!_PyObject_IS_GC(op)) return 0;

    // Skip objects not being collected
    PyGC_Head *gc = AS_GC(op);
    if (!gc_is_collecting(gc)) return 0;

    // Skip objects on younger GC lists
    const Py_ssize_t gc_refs = gc_get_refs(gc);
    if (gc_refs == 1) return 0;

    // Check if the object is a foreign reference
    PyTypeObject* tp = Py_TYPE(op);
    if (tp->tp_free == Foreign_free) {
        internalize_add(op, arg);
        return 0;
    }

    // Perform DFS traversal for objects bound for garbage collection
    gc->_gc_prev ^= PREV_MASK_COLLECTING;
    tp->tp_traverse(op, internalize_trace, arg);
    gc->_gc_prev |= PREV_MASK_COLLECTING;
    return 0;
}

/* 
 * Analyze linkages between foreign references and Python objects.
 * This function is called at the end of the GC process to discover reference loops.
 */
static void internalize_analyze() {
    ForeignReference* current = references.next;
    while (current != &references) {
        PyObject* op = current->local_object;

        // Perform DFS traversal to find reference loops
        internalize_start(op, NULL);
        internalize_trace(op, NULL);
        internalize_end(op, NULL);
        current = current->next;
    }
}

/* 
 * Sentinel Object
 *
 * The Sentinel object is used to track the GC process and perform specific actions during
 * different phases of garbage collection. It acts as a pivot point for GC operations.
 */

/* 
 * Assert that the Sentinel class is private.
 * This prevents unauthorized creation of Sentinel objects.
 */
static int assert_private(void* args) {
    if (args != magic) {
        PyErr_SetString(PyExc_TypeError, "This class is private");
        return -1;
    }
    return 0;
}

/* 
 * Pivot Object
 *
 * The Pivot object is used internally by the Sentinel to manipulate the GC process.
 */
typedef struct {
    PyObject_HEAD
    void *sentinel;  // Pointer to the Sentinel object
} PivotObject;

/* 
 * Traverse function for PivotObject.
 * Handles GC traversal for Pivot objects during specific GC phases.
 */
static int Pivot_traverse(PivotObject* self, visitproc visit, void* arg) {
    if (phase <= 0) return 0;

    if (phase == 1) {
        PyGC_Head* reachable = (PyGC_Head*) arg;

        // Move the Sentinel object to the end of the GC list
        PyGC_Head* gc2 = AS_GC(self->sentinel);
        gc_list_remove(gc2);
        gc_list_append(gc2, reachable);
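        // 6 == (1 << _PyGC_PREV_SHIFT) | PREV_MASK_COLLECTING, assuming a shift of 2:
        // a gc_refs value of 1 with the collecting flag set
        gc2->_gc_prev = 6;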
    }

    return 0;
}

/* 
 * Initialize the Pivot object.
 */
static int Pivot_init(PyObject *self, PyObject *args, PyObject *kwargs) {
    return assert_private(args);
}

/* 
 * PivotObject type definition.
 */
static PyTypeObject PivotType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "helloworld.Pivot",
    .tp_basicsize = sizeof(PivotObject),
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
    .tp_traverse = (traverseproc) Pivot_traverse,
    .tp_new = PyType_GenericNew,
    .tp_init = Pivot_init,
};

/* 
 * Sentinel Object
 */
typedef struct {
    PyObject_HEAD
    PivotObject* pivot;  // Pointer to the Pivot object
} SentinelObject;

/* 
 * Initialize the Sentinel object.
 */
static int Sentinel_init(PyObject *self, PyObject *args, PyObject *kwargs) {
    if (assert_private(args) == -1) return -1;

    SentinelObject *sentinel = (SentinelObject *)self;

    // Create a new PivotObject instance
    PyObject *pivot_instance = PyObject_CallObject((PyObject *)&PivotType, args);
    if (!pivot_instance) {
        PyErr_SetString(PyExc_RuntimeError, "Failed to create PivotObject");
        return -1;
    }

    // Link the Sentinel and Pivot objects
    sentinel->pivot = (PivotObject *)pivot_instance;
    sentinel->pivot->sentinel = self;

    return 0;
}

/* 
 * Traverse function for SentinelObject.
 * Handles GC traversal for Sentinel objects during specific GC phases.
 */
static int Sentinel_traverse(SentinelObject* self, visitproc visit, void* arg) {
    if (phase < 0 || generation != 2) {
        Py_VISIT(self->pivot);
        visit_references(visit, arg);
        return 0;
    }

    PyGC_Head *gc = AS_GC(self);

    if (phase == 0) {
        Py_VISIT(self->pivot);
        visit_references(visit, arg);
        phase++;
        return 0;
    }

    // Check if we are at the end of the reachable list
    PyGC_Head* reachable = arg;

    if (phase == 1 && GC_PREV(reachable) != gc) {
        if (gc_is_collecting(AS_GC(self->pivot))) {
            Py_VISIT(self->pivot);
            return 0;
        }

        // Move the pivot point to the back of the list
        //   This would be in trouble if pivot was the node before us, but that isn't possible unless we were already last
        PyGC_Head* gc2 = AS_GC(self->pivot);
        gc_list_remove(gc2);
        gc_list_append(gc2, reachable);
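        // 6 == (1 << _PyGC_PREV_SHIFT) | PREV_MASK_COLLECTING, assuming a shift of 2:
        // a gc_refs value of 1 with the collecting flag set
        gc2->_gc_prev = 6;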
        return 0;
    }

    if (phase == 1) {
        printf("I am last\n");
        internalize_analyze();

        // Foreign objects now see phase 2, meaning they won't renew their lease
        phase++;
        visit_references(visit, arg);

        if (gc_is_collecting(AS_GC(self->pivot)))
            Py_VISIT(self->pivot);
    }

    if (phase == 2) {
        Py_VISIT(self->pivot);
    }

    return 0;
}

/* 
 * SentinelObject type definition.
 */
static PyTypeObject SentinelType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "helloworld.Sentinel",
    .tp_basicsize = sizeof(SentinelObject),
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
    .tp_traverse = (traverseproc) Sentinel_traverse,
    .tp_new = PyType_GenericNew,
    .tp_init = Sentinel_init,
};

/* 
 * GC Monitoring Functions
 *
 * These functions enable and disable monitoring of Python's garbage collection process.
 * They use Python's `gc.callbacks` mechanism to hook into GC events.
 */

/* 
 * Callback function for GC events.
 * This function is called during GC cycles to track the current generation and phase.
 */
static PyObject* gc_event_callback(PyObject* self, PyObject* args) {
    const char* event;  // Event type ("start" or "stop")
    generation = -1;    // Reset generation
    PyGCState* gc_state = get_gc_state();
    if (!gc_state) return NULL;

    PyObject* details;  // Details dictionary

    // Parse the arguments: a string (event) and a dictionary (details)
    if (!PyArg_ParseTuple(args, "sO", &event, &details)) {
        return NULL;  // Return NULL on parsing failure
    }

    if (!PyDict_Check(details)) {
        PyErr_SetString(PyExc_TypeError, "Details argument must be a dictionary");
        return NULL;
    }

    // Extract the "generation" value from the details dictionary
    PyObject* generation_obj = PyDict_GetItemString(details, "generation");
    if (generation_obj != NULL) {
        generation = PyLong_AsLong(generation_obj);
    }

    // Print messages based on the event type
    if (strcmp(event, "start") == 0) {
        phase = 0;
        printf("GC cycle started for generation %d\n", generation);
    } else if (strcmp(event, "stop") == 0) {
        printf("GC cycle ended for generation %d\n", generation);
        phase = -1;

        // At this point, we tell Java to drop the old leases and start with the new ones.
    } else {
        PyErr_SetString(PyExc_ValueError, "Invalid event type. Must be 'start' or 'stop'");
        return NULL;
    }

    // Special handling for generation 2
    if (phase == 0 && generation == 2) {
        PyGC_Head* gc = AS_GC(sentinel_instance);
        gc_list_remove(gc);
        gc_list_append(gc, GEN_HEAD(gc_state, 1));
    }

    Py_RETURN_NONE;  // Return None to indicate successful execution
}

// Method definition used to wrap gc_event_callback as a Python callable
static PyMethodDef my_method_def = {
    "callback",                   // Name of the method
    gc_event_callback,            // Function pointer to the callback implementation
    METH_VARARGS,                 // Method accepts a variable number of arguments
    "Callback for gc"             // Documentation string for the method
};

/* 
 * Enable GC monitoring.
 * This function adds the callback function to Python's `gc.callbacks` list.
 */
static PyObject* enable_gc_monitoring(PyObject* self, PyObject* args) {
    // Import the `gc` module
    PyObject* gc_module = PyImport_ImportModule("gc");
    if (!gc_module) {
        PyErr_SetString(PyExc_ImportError, "Failed to import gc module");
        return NULL;
    }

    // Get the `callbacks` attribute from the `gc` module
    PyObject* gc_callbacks = PyObject_GetAttrString(gc_module, "callbacks");
    Py_DECREF(gc_module);  // Release the reference to the gc module
    if (!gc_callbacks) {
        PyErr_SetString(PyExc_AttributeError, "Failed to get gc.callbacks");
        return NULL;
    }

    // Ensure `gc.callbacks` is a list
    if (!PyList_Check(gc_callbacks)) {
        Py_DECREF(gc_callbacks);
        PyErr_SetString(PyExc_TypeError, "gc.callbacks is not a list");
        return NULL;
    }

    // Check if the callback is already stored
    if (gc_event_callback_function != NULL) {
        Py_DECREF(gc_callbacks);
        PyErr_SetString(PyExc_RuntimeError, "GC monitoring is already enabled");
        return NULL;
    }

    // Wrap the internal `gc_event_callback` function as a Python callable
    gc_event_callback_function = PyCFunction_New(&my_method_def, NULL);
    if (!gc_event_callback_function) {
        Py_DECREF(gc_callbacks);
        PyErr_SetString(PyExc_RuntimeError, "Failed to create callable for gc_event_callback");
        return NULL;
    }

    // Append the callback to the `gc.callbacks` list
    if (PyList_Append(gc_callbacks, gc_event_callback_function) < 0) {
        Py_DECREF(gc_callbacks);
        Py_DECREF(gc_event_callback_function);
        gc_event_callback_function = NULL;  // Reset the global variable
        PyErr_SetString(PyExc_RuntimeError, "Failed to append callback to gc.callbacks");
        return NULL;
    }

    Py_DECREF(gc_callbacks);  // Release the reference to gc.callbacks
    Py_RETURN_NONE;  // Return None to indicate success
}

/* 
 * Disable GC monitoring.
 * This function removes the callback function from Python's `gc.callbacks` list.
 */
static PyObject* disable_gc_monitoring(PyObject* self, PyObject* args) {
    // Import the `gc` module
    PyObject* gc_module = PyImport_ImportModule("gc");
    if (!gc_module) {
        PyErr_SetString(PyExc_ImportError, "Failed to import gc module");
        return NULL;
    }

    // Get the `callbacks` attribute from the `gc` module
    PyObject* gc_callbacks = PyObject_GetAttrString(gc_module, "callbacks");
    Py_DECREF(gc_module);  // Release the reference to the gc module
    if (!gc_callbacks) {
        PyErr_SetString(PyExc_AttributeError, "Failed to get gc.callbacks");
        return NULL;
    }

    // Ensure `gc.callbacks` is a list
    if (!PyList_Check(gc_callbacks)) {
        Py_DECREF(gc_callbacks);
        PyErr_SetString(PyExc_TypeError, "gc.callbacks is not a list");
        return NULL;
    }

    // Check if the callback is stored
    if (gc_event_callback_function == NULL) {
        Py_DECREF(gc_callbacks);  // Release the reference to gc.callbacks
        PyErr_SetString(PyExc_RuntimeError, "GC monitoring is not enabled");
        return NULL;
    }

    // Find and remove the callback from the `gc.callbacks` list
    Py_ssize_t index = PySequence_Index(gc_callbacks, gc_event_callback_function);
    if (index == -1) {
        Py_DECREF(gc_callbacks);
        PyErr_SetString(PyExc_ValueError, "Callback not found in gc.callbacks");
        return NULL;
    }

    if (PySequence_DelItem(gc_callbacks, index) < 0) {
        Py_DECREF(gc_callbacks);
        PyErr_SetString(PyExc_RuntimeError, "Failed to remove callback from gc.callbacks");
        return NULL;
    }

    Py_DECREF(gc_callbacks);  // Release the reference to gc.callbacks
    Py_DECREF(gc_event_callback_function);  // Release the reference to the callback
    gc_event_callback_function = NULL;  // Reset the global variable

    Py_RETURN_NONE;  // Return None to indicate success
}

/* 
 * Add a reference to the foreign reference list.
 * This simulates a Java-requested reference.
 */
static PyObject* reference_add(PyObject* self, PyObject* arg) {
    ForeignReference* reference = (ForeignReference*)malloc(sizeof(ForeignReference));
    if (reference == NULL) {
        return PyErr_NoMemory();  // Allocation failed; report MemoryError
    }
    reference->local_object = arg;
    reference->remote_object = NULL;
    Py_INCREF(arg);  // The list holds a strong reference to the Python object
    reference_insert(&references, reference);
    printf("ADD %p\n", (void*)arg);
    Py_RETURN_NONE;  // Return None to indicate success
}

/* 
 * Method definitions for the module.
 */
static PyMethodDef HelloWorldMethods[] = {
    {"enable", enable_gc_monitoring, METH_VARARGS, "Enable clgc"},
    {"disable", disable_gc_monitoring, METH_NOARGS, "Disable clgc"},
    {"callback", gc_event_callback, METH_VARARGS, "Monitors the gc process"},
    {"add", reference_add, METH_O, "Simulate Java requested reference"},
    {NULL, NULL, 0, NULL}  // Sentinel
};

/* 
 * Module definition.
 */
static struct PyModuleDef helloworldmodule = {
    PyModuleDef_HEAD_INIT,
    "helloworld",  // Name of the module
    "A module with GC cycle monitoring",  // Module documentation
    -1,  // Size of per-interpreter state or -1 if state is global
    HelloWorldMethods
};

/* 
 * Initialization function for the module.
 */
PyMODINIT_FUNC PyInit_helloworld(void) {
    PyObject* m;

    // Initialize the foreign reference list
    reference_init(&references);
    generation = -1;
    phase = -1;

    // Initialize the types
    if (PyType_Ready(&SentinelType) < 0) return NULL;
    if (PyType_Ready(&ForeignType) < 0) return NULL;
    if (PyType_Ready(&PivotType) < 0) return NULL;

    // Create the module
    m = PyModule_Create(&helloworldmodule);
    if (!m) return NULL;

    // Add the Sentinel type to the module
    Py_INCREF(&SentinelType);
    PyModule_AddObject(m, "Sentinel", (PyObject*)&SentinelType);

    Py_INCREF(&ForeignType);
    PyModule_AddObject(m, "Foreign", (PyObject*)&ForeignType);

    Py_INCREF(&PivotType);
    PyModule_AddObject(m, "Pivot", (PyObject*)&PivotType);

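    // Create a magic tuple intended to let us (and only us) create Sentinel.
    // Caveat: CPython shares a single empty tuple, so a no-argument call from
    // Python passes the same object and this check is only a soft guard.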
    PyObject* args = PyTuple_New(0);
    magic = args;

    // Create static instances of Sentinel
    sentinel_instance = PyObject_CallObject((PyObject*)&SentinelType, args);
    if (!sentinel_instance) {
        Py_DECREF(m);
        Py_DECREF(args);
        return NULL;
    }

    Py_DECREF(args);
    Py_INCREF(sentinel_instance);
    PyModule_AddObject(m, "sentinel", sentinel_instance);

    return m;
}

Let's move the comments into the thread to make it easier to read.

Cross-Language Garbage Collection (CLGC) Demonstration Module

This module addresses the challenges of garbage collection across languages, specifically between Python and Java, by introducing hooks into Python’s generation 2 garbage collector. It demonstrates how Python can efficiently manage cross-language references through a process called internalization.
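To make the hook concrete, here is a minimal sketch of the phase tracking the module performs around a generation 2 collection, with all the CPython internals stripped out. The names `clgc_state`, `clgc_on_event`, `clgc_on_sentinel_traverse`, and `clgc_should_renew` are hypothetical and exist only for this illustration; the integer values mirror the module's `phase` variable as I read it, so treat it as an approximation rather than the module's actual control flow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the values of the module's integer `phase`. */
enum clgc_phase {
    CLGC_IDLE = -1,     /* no collection in progress */
    CLGC_STARTED = 0,   /* "start" event seen, sentinel not yet traversed */
    CLGC_RENEWING = 1,  /* foreign objects traversed now renew their lease */
    CLGC_ANALYZED = 2   /* analysis done, foreign objects traversed now are skipped */
};

struct clgc_state {
    int generation;
    enum clgc_phase phase;
};

/* Handle a gc.callbacks-style event name ("start" or "stop").
 * Returns 0 on success and -1 for an unknown event. */
int clgc_on_event(struct clgc_state *s, const char *event, int generation)
{
    assert(s != NULL && event != NULL);
    if (strcmp(event, "start") == 0) {
        s->generation = generation;
        s->phase = CLGC_STARTED;
        return 0;
    }
    if (strcmp(event, "stop") == 0) {
        s->generation = generation;
        s->phase = CLGC_IDLE;
        return 0;
    }
    return -1;
}

/* Called whenever the sentinel is traversed. Only a generation 2
 * collection advances the phase. `is_last` says whether the sentinel
 * sits at the end of the reachable list. Returns the resulting phase. */
enum clgc_phase clgc_on_sentinel_traverse(struct clgc_state *s, int is_last)
{
    assert(s != NULL);
    if (s->phase == CLGC_IDLE || s->generation != 2)
        return s->phase;
    if (s->phase == CLGC_STARTED)
        s->phase = CLGC_RENEWING;
    else if (s->phase == CLGC_RENEWING && is_last)
        s->phase = CLGC_ANALYZED;
    return s->phase;
}

/* A foreign object renews its lease only in the renewing phase. */
int clgc_should_renew(const struct clgc_state *s)
{
    assert(s != NULL);
    return s->phase == CLGC_RENEWING;
}
```

A caller would feed `clgc_on_event` from the `gc.callbacks` hook and call `clgc_on_sentinel_traverse` from the sentinel's traverse function. Collections of generation 0 or 1 leave the phase untouched, which is why only full collections take part in lease management.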


Internalization Process

Internalization is the process by which Python identifies isolated objects connected to foreign systems, delegates ownership of their lifecycle to the foreign system, and ensures proper cleanup of cross-language references. This ensures Python no longer holds responsibility for keeping foreign objects alive, allowing the foreign system to manage their lifecycle efficiently.


Mechanics

  1. Detection of Isolated Python Objects

    • During Python’s garbage collection traversal, the ReferenceManager identifies Python objects that are not connected to other Python objects except through the ReferenceManager.
    • These objects are considered “isolated” from Python’s perspective and are candidates for internalization.
  2. Depth-First Search (DFS)

    • The ReferenceManager uses Python’s traversal system to perform a depth-first search (DFS) on the reference graph rooted at the isolated Python object.
    • The DFS explores all references originating from the Python object to determine its relationship with foreign objects (e.g., Java objects).
  3. Discovery of Java References

    • During the traversal, if the ReferenceManager encounters a reference to a Java object:
      • It checks whether the Java object has already been visited by Java’s garbage collector during its cycle.
      • If the Java object has not been visited, it means the Java object is still reachable from Python but not actively held by Java.
  4. Delegating Ownership to Java

    • If an unvisited Java object is discovered during the DFS:
      • The relationship between the Python object and the Java object is sent to the Java side.
      • On the Java side:
        • The Java object is removed from the strong global reference list, meaning Java no longer holds the object alive directly.
        • The Java object is now held alive only by the weak reference originating from Python.
  5. Handling Cases with No Java References

    • If the depth-first search does not discover any Java references:
      • The Python object is added to the foreign_list in the ReferenceManager.
      • The Python object will remain on the foreign_list until Java breaks its weak link to the object.
      • Since no Java references were found, there cannot be a cross-language reference loop involving this Python object.
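
The five steps above can be sketched on a toy graph. This is only an illustration under stated assumptions: `Node`, `Trace`, `trace`, and `internalize` are hypothetical names that stand in for real Python objects and the ReferenceManager, and the `visiting` flag plays the role of the temporary mark used during the real traversal.

```c
#include <stddef.h>

#define MAX_EDGES 4
#define MAX_FOUND 8

/* Hypothetical stand-ins: a Python object, or a Python-side proxy for a Java object. */
typedef enum { NODE_PYTHON, NODE_JAVA_REF } NodeKind;

typedef struct Node {
    NodeKind kind;
    int reachable;                  /* already rescued by the normal GC pass */
    int visiting;                   /* currently on the DFS stack */
    struct Node *edges[MAX_EDGES];  /* outgoing references */
    size_t n_edges;
} Node;

typedef struct {
    Node *found[MAX_FOUND];         /* Java references discovered by the DFS */
    size_t n_found;
} Trace;

/* Steps 2 and 3: depth-first search that stops at Java references. */
static void trace(Node *n, Trace *t)
{
    size_t i;
    if (n->reachable || n->visiting)
        return;                     /* kept alive elsewhere, or a cycle back-edge */
    if (n->kind == NODE_JAVA_REF) {
        for (i = 0; i < t->n_found; i++)
            if (t->found[i] == n)
                return;             /* already recorded */
        if (t->n_found < MAX_FOUND)
            t->found[t->n_found++] = n;
        return;
    }
    n->visiting = 1;
    for (i = 0; i < n->n_edges; i++)
        trace(n->edges[i], t);
    n->visiting = 0;
}

/* Steps 4 and 5: a non-zero result means "report these relationships to Java";
 * zero means "park the Python object on the foreign_list". */
static size_t internalize(Node *root, Trace *t)
{
    t->n_found = 0;
    trace(root, t);
    return t->n_found;
}
```

Calling `internalize` on an isolated Python node that reaches one Java proxy through a Python cycle returns 1 and records that proxy; a node with no outgoing references returns 0.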

Key Scenarios

Scenario 1: Python Object References a Java Object

  • The Python object is isolated except for its reference to the Java object.
  • The DFS discovers the Java object, which has not been visited by Java’s garbage collector.
  • The relationship is delegated to Java, and the Java object is removed from the strong global reference list.
    Outcome: Python no longer holds responsibility for the Java object, and the Java object is managed by Java.

Scenario 2: Python Object Does Not Reference Any Java Object

  • The Python object is isolated.
  • The DFS does not discover any Java references.
  • The Python object is added to the foreign_list and waits for Java to break its weak link.
    Outcome: The Python object remains on the foreign_list until Java determines it is no longer needed.

Scenario 3: Cross-Language Reference Loop

  • The DFS discovers a loop involving both Python and Java objects.
  • The relationship is delegated to Java, and Java assumes ownership of the loop.
    Outcome: Python no longer holds responsibility for the loop, and Java manages its lifecycle.

Advantages

  • Efficient Garbage Collection: Delegating ownership of cross-language relationships to the foreign system reduces Python’s involvement in managing foreign objects.
  • Breaks Reference Loops: Internalization ensures that reference loops involving both Python and Java are broken, preventing memory leaks.
  • Optimized Resource Management: Objects are cleaned up promptly when no references exist on either side.

Edge Cases

Case 1: Java Object Referenced by Multiple Python Objects

  • If multiple Python objects reference the same Java object, the Java object will remain alive until all Python references are removed.

Case 2: Weak Links

  • If the Java object is held alive only by a weak link from Python, it will be garbage collected by Java once Python removes its reference.

Case 3: Synchronization Delays

  • If Python and Java garbage collection cycles are not synchronized, there may be slight delays in cleanup. This is not critical but could be optimized.

Python GC Phases

Phase 1: subtract_refs

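  • Decrements the collector’s working copy of each reference count for every reference that originates inside the collection set. If an object’s working count drops to zero, nothing outside the set refers to it, and it is tentatively considered unreachable.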
  • The arg parameter in the traversal callback (visit_decref) represents the parent object being traversed.

Phase 2: move_reachable

  • Identifies all reachable objects starting from the roots and moves them, along with their dependencies, into the reachable list.
  • The arg parameter in this phase typically represents the new list of reachable items (PyGC_Head*).

Phase 3 (Debugging Enabled)

  • Exploits properties of Python’s GC to pivot the Sentinel object to the back of the GC list. All objects afterward must be owned by a foreign object or collected. At this point, the structure and links between objects can be analyzed.
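
Phases 1 and 2 can be illustrated with a toy model. This is a simplified sketch, not CPython's actual implementation: the `Obj` struct and the helper `mark_reachable` are hypothetical, every reference is assumed to point inside the collection set, and reachability is recorded in a flag rather than by moving objects between lists.

```c
#include <stddef.h>

#define MAX_REFS 4

typedef struct Obj {
    int refcount;               /* real count: internal plus external holders */
    int gc_refs;                /* scratch copy used only while collecting */
    int reachable;
    struct Obj *refs[MAX_REFS]; /* outgoing references, all inside the set */
    size_t n_refs;
} Obj;

/* Phase 1: cancel out every reference that originates inside the set. */
static void subtract_refs(Obj **set, size_t n)
{
    size_t i, j;
    for (i = 0; i < n; i++) {
        set[i]->gc_refs = set[i]->refcount;
        set[i]->reachable = 0;
    }
    for (i = 0; i < n; i++)
        for (j = 0; j < set[i]->n_refs; j++)
            set[i]->refs[j]->gc_refs--;
}

/* Anything referenced by a reachable object is reachable too. */
static void mark_reachable(Obj *o)
{
    size_t j;
    if (o->reachable)
        return;
    o->reachable = 1;
    for (j = 0; j < o->n_refs; j++)
        mark_reachable(o->refs[j]);
}

/* Phase 2: objects with a leftover count are held from outside the set. */
static void move_reachable(Obj **set, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (set[i]->gc_refs > 0)
            mark_reachable(set[i]);
}
```

With a two-object cycle that nothing else refers to, both working counts reach zero and neither object is marked. An object with one external holder keeps a working count of 1 and rescues everything it refers to.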

Implementation Notes

  • Internalization is only possible on the Python side because Java lacks a traversal mechanism to discover relationships.
  • Proper synchronization between Python and Java garbage collection processes is essential for efficient cleanup.
  • Ensure the is_broken function is implemented correctly and efficiently, as it plays a critical role in determining the lifecycle of objects.

Future Considerations

  • A future PEP could define the minimal API required to support CLGC:
    1. A method to report the type of GC being executed at the start of the cycle.
    2. Callbacks at each major phase to allow state alterations.
    3. Methods to check object states during each phase, accounting for Python objects potentially existing in multiple states within the same phase.

This module simulates interactions with Java objects and demonstrates how Python’s GC can be extended for cross-language garbage collection.

Here is the minimum API that would be required to support CLGC.

/**
 * Macro: PyGC_VISIT_DFS
 * ---------------------
 * Implements a depth-first search (DFS) traversal for garbage collection. This
 * macro temporarily modifies the _gc_prev field of a garbage collection
 * header (PyGC_Head) to mark it as visited during the traversal, then calls
 * the object's tp_traverse method to recursively visit its references.
 *
 * Parameters:
 * - op: A pointer to the Python object being visited (PyObject*).
 * - visit: The callback function to be applied during traversal (visitproc).
 * - arg: Additional arguments passed to the callback function.
 *
 * Usage:
 * - This macro is used in the internal garbage collection process to analyze
 *   object references and determine reachability.
 */
#define PyGC_VISIT_DFS(op, visit, arg) \
  do { \
    AS_GC(op)->_gc_prev ^= PREV_MASK_COLLECTING; \
    Py_TYPE(op)->tp_traverse(op, visit, arg); \
    AS_GC(op)->_gc_prev |= PREV_MASK_COLLECTING; \
  } while (0)
 
/**
 * Typedef: clgcfunc
 * ------------------
 * Defines the function signature for a callback used as a reference manager
 * during Python's garbage collection process. This callback is invoked at
 * various phases of the generation 2 garbage collection cycle to manage foreign
 * objects and their references.
 *
 * Function Signature:
 * - int (*clgcfunc)(int phase, visitproc visit, void* args);
 *
 * Parameters:
 * - phase: Indicates the current phase of the garbage collection process.
 *   - 0: A new garbage collection cycle is beginning.
 *   - 1: The decrefs phase is complete, and objects with zero external
 *     references are subject to collection. Foreign objects should be visited
 *     at this phase to treat them as normal objects.
 *   - 2: The reachability analysis is complete. Objects not yet reachable
 *     will be collected. Foreign objects still needed should be recovered at
 *     this phase.
 *   - 3: The garbage collection cycle is completed.
 * - visit: A callback function (visitproc) used for traversing object
 *   references during the garbage collection process.
 * - args: Additional arguments passed to the callback function, typically
 *   used for context or state management.
 *
 * Returns:
 * - 0 on success.
 * - Non-zero values can be used to indicate errors or specific conditions
 *   during the garbage collection process.
 *
 * Usage:
 * - Implement this function type to define a custom reference manager for
 *   Python's garbage collector. The reference manager should handle foreign
 *   object tracking and cleanup during the specified GC phases.
 *
 */
typedef int (*clgcfunc)(int phase, visitproc visit, void* args);

/**
 * Function: PyGC_IsReachable
 * --------------------------
 * Determines whether a given Python object is reachable at the end of the
 * garbage collection reachability phase.
 *
 * Parameters:
 * - obj: A pointer to the Python object (PyObject*) being checked.
 *
 * Returns:
 * - 1 if the object is reachable.
 * - 0 if the object is not reachable.
 *
 * Notes:
 * - This function should only be called at the end of the reachability phase
 *   (phase 2). Calling it at any other time during the GC cycle will produce
 *   undefined or unexpected results.
 *
 * Usage:
 * - Use this function to verify whether an object has been marked as reachable
 *   during garbage collection.
 */
int PyGC_IsReachable(PyObject *obj);

/**
 * Function: PyGC_InstallReferenceManager
 * --------------------------------------
 * Installs a custom reference manager for the Python interpreter. The reference
 * manager integrates with Python's garbage collector to track and manage
 * foreign objects during a generation 2 garbage collection cycle.
 *
 * Parameters:
 * - manager: A callback function (clgcfunc) that will be invoked during
 *   different phases of the garbage collection process. The callback function
 *   signature is:
 *   int manager(int phase, visitproc visit, void* args)
 *   - phase: Indicates the current phase of the garbage collection process.
 *   - visit: A callback function used for traversal during the GC process.
 *   - args: Additional arguments passed to the callback function.
 *
 * Returns:
 * - 0 on success.
 * - -1 if a reference manager is already installed.
 *
 * Notes:
 * - Only one reference manager can be installed at a time. Attempting to
 *   install a second reference manager will result in a runtime error.
 * - The reference manager is responsible for ensuring proper tracking and
 *   cleanup of foreign objects during garbage collection.
 *
 * Usage:
 * - Use this function to integrate custom foreign object tracking into Python's
 *   garbage collector.
 */
int PyGC_InstallReferenceManager(clgcfunc manager)
{
    if (reference_manager != NULL) {
        PyErr_SetString(PyExc_RuntimeError, "Only one reference manager allowed");
        return -1;
    }
    reference_manager = manager;
    return 0;
}

And this is how it may be used.

/* 
 * Perform a depth-first search (DFS) using Python's traversal mechanism.
 * This function analyzes relationships between foreign incoming references and foreign outgoing ones.
 */
static int internalize_trace(PyObject *op, void *arg) {
    if (PyGC_IsReachable(op))
        return 0;

    // Check if the object is a foreign reference
    PyTypeObject* tp = Py_TYPE(op);
    if (tp->tp_free == Foreign_free) {
        internalize_add(op, arg);
        return 0;
    }

    // Perform DFS traversal for objects bound for garbage collection
    PyGC_VISIT_DFS(op, internalize_trace, arg);
    return 0;
}

/* 
 * Analyze linkages between foreign references and Python objects.
 * This function is called at the end of the GC process to discover reference loops.
 */
static void internalize_analyze(void) {
    ForeignReference* current = references.next;
    while (current != &references) {
        PyObject* op = current->local_object;

        // Perform DFS traversal to find reference loops
        internalize_start(op, NULL);
        internalize_trace(op, NULL);
        internalize_end(op, NULL);
        current = current->next;
    }
}

/** 
 * Here is a sample of how the reference manager hooks are used.
 */
int ReferenceManager_trigger(int phase, visitproc visit, void* args)
{
    printf("trigger %d\n", phase);

    // A new GC cycle is beginning
    if (phase == 0)
    {
        renew = 0;
        skip = 0;
        return 0;
    }

    // decref phase is completing
    if (phase == 1)
    {
        visit_references(visit, args);

        // Any reachable foreign object should renew or request a lease.
        renew = 1;
        return 0;
    }

    // reachable analysis is completing
    if (phase == 2)
    {
        renew = 0;

        // Analyze reachability and inform Java of disconnected segments.
        internalize_analyze();

        skip = 1;

        // Add our references to the reachability to keep them alive until Java terminates them
        visit_references(visit, args);
        return 0;
    }

    // gc cycle is ended.
    if (phase == 3)
    {
        // Notify Java that new leases are in force.
        renew = 0;
        skip = 0;
    }
    return 0;
}

I’m just a curious bystander here, but would this work on Android?

It sounds like it might be useful for apps using e.g. Beeware’s Toga, which uses Chaquopy to interop with Java and currently can’t handle reference cycles.

(Android was added as a supported platform in Python 3.13 with PEP 738.)

JPype works with Android at least at the demonstration level. Alternatively, PyJNIus could use the same mechanism for Android support, though they would need modifications to switch from strong global references to the double weak scheme for holding their objects. I am not familiar with Chaquopy.

The proposed method is generic for any system in which the Python relationships can be delegated to another GC to be resolved. Rather than implementing the bridged CLGC, this idea is simply to provide a minimal set of hooks and examples such that others can implement integrated garbage collection in a seamless fashion. Thus it would be effective with Android, Java, C# .NET, or a C++-based garbage collection method with similar parameters to virtual ones. The same hooks could be used to construct distributed garbage collection. One could in principle chain multiple instances of Python running on different machines, so long as it was a hub-and-spoke type network in which machines are not creating arbitrary connections amongst themselves that require GC. This is because Python itself would satisfy the requirements for shared GC under this proposal. As to whether it will work with other existing distributed solutions, I would have to defer to experts, as DGC is a broad and diverse topic.

So, the proposal is to add the minimal API to CPython?
Given their nature and use case, I think they should be unstable (CPython version specific), if added.
IMO, debugger/optimizer hooks of a similar calibre are not unheard of.

Some questions the PEP should raise (and ultimately answer):

  • Who would maintain it? How would it be tested for correctness, without requiring Java for the test suite?
  • How badly does this lock in the current implementation of the GC?

There is no requirement to run Java to test this. If you compile the example I provided and run it, you will see that it performs the analysis and just prints the results of reachability to the screen. For testing we can simply create a user of this API that creates a bunch of scenarios (linked through dict, linked through list, was linked through list and recovered, etc.) and collects the linkages in the test module to be probed against.
In other words, it is completely reasonable for me to be required to supply a full mock system that would be exercised, and to help maintain it. Incidentally, I already had to do much of that work because I needed to verify this was working on all the expected forms: I, w, Z, O, 8, a, 9, 6, etc. (Hint: view the letters as graphs in which an end point is held either by one side or the other.)

Alternatively, we could implement a cross-Python-instance bridge (Python-socket-Python), which is a form of DGC that would be usable for general Python users. I have wanted such a beast for a long time during my code testing with pytest, in which I create a subinterpreter in another process and then deal with it remotely. (I have an implementation that does a one-and-done execution using pipes, but it would be nice if one could keep it up for a long time and manipulate it long enough that GC is important.) In other words, build something general purpose on this rather than niche and it becomes easier to justify.

As for the cycles of the GC, I could see this operating as is with a full mark-and-sweep GC just fine. The difference is that it would skip over the decref phase entirely or use it for a different purpose. We would need to document the requirements of this module a bit differently to support it, but it could run with many different styles of GC.

Phase 0 tells manager that gc is preparing a cycle.
Phase 1 (repeatable) requires manager to visit all elements. (Use the visit/args supplied)
Phase 2 (once) informs manager that all items that are not visited now will be collected and DFS can be executed, (use visit/args to rescue)
Phase 3 informs manager gc is done so it can inform foreign side of a new lease.

Under current Python it does 0-1-2-3. If you turn on the debug option, which does a post visit for statistics, you get 0-1-2-1-1-3. A new Python GC implementation may do 0-1-1-2-1-3. (Any users of such an API can be given a hardness test in which extra phase 1 calls are tossed in.) As the traverse/visit API that this is exploiting is already tied into Python, you don’t lose much. One could even run a different copy of the reference manager for each subinterpreter. So long as only those items that were reachable in the GC scope are being collected, it would operate normally.
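
A test of that kind could look roughly like the sketch below. It uses no Python headers: `mock_visitproc` and `mock_clgcfunc` are hypothetical stand-ins for `visitproc` and `clgcfunc`, and the rule that phase 2 runs exactly once per cycle is this sketch's reading of the phase list above rather than a settled part of the proposal.

```c
#include <stddef.h>

/* Hypothetical stand-ins for visitproc and clgcfunc so the sketch needs no Python.h. */
typedef int (*mock_visitproc)(void *obj, void *arg);
typedef int (*mock_clgcfunc)(int phase, mock_visitproc visit, void *arg);

typedef struct {
    int in_cycle;        /* between phase 0 and phase 3 */
    int rescued;         /* phase 2 already ran in this cycle */
    int leases_renewed;  /* number of completed cycles */
} ManagerState;

static ManagerState state;
static int tracked[3];   /* three fake objects held on behalf of the foreign side */

/* The "GC side" visitor: just counts how many objects it is shown. */
static int count_visit(void *obj, void *arg)
{
    (void)obj;
    ++*(int *)arg;
    return 0;
}

/* A manager that tolerates any number of phase 1 calls, before or after phase 2. */
static int mock_manager(int phase, mock_visitproc visit, void *arg)
{
    size_t i;
    switch (phase) {
    case 0:
        state.in_cycle = 1;
        state.rescued = 0;
        return 0;
    case 1:
        if (!state.in_cycle)
            return -1;
        for (i = 0; i < 3; i++)
            visit(&tracked[i], arg);
        return 0;
    case 2:
        if (!state.in_cycle || state.rescued)
            return -1;
        state.rescued = 1;
        for (i = 0; i < 3; i++)
            visit(&tracked[i], arg);  /* rescue everything still leased */
        return 0;
    case 3:
        if (!state.in_cycle || !state.rescued)
            return -1;
        state.in_cycle = 0;
        state.leases_renewed++;
        return 0;
    default:
        return -1;
    }
}

/* Drive the manager through an arbitrary phase order; stop at the first error. */
static int run_cycle(mock_clgcfunc mgr, const int *phases, size_t n, int *visited)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (mgr(phases[i], count_visit, visited) != 0)
            return -1;
    return 0;
}
```

Feeding `run_cycle` the orders 0-1-2-3, 0-1-2-1-1-3, and 0-1-1-2-1-3 should succeed for any manager that honors the contract, while an order that omits phase 0 is rejected by this mock.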

What does become restricted is GC styles in which only a portion of the heap is analyzed much of the time. For example, one can imagine a GC that is not stop-the-world but instead a per-thread GC. Generational GC does not pose an issue (as this code works fine with the current Python generational GC). Of course, even then, so long as the GC has an occasional full GC cycle it could still operate to deal with irresolvable loops, though perhaps not as fast as they get built. In other words, this proposal doesn’t appear to lock Python in much. The other type of GC that it won’t support is blind GC, like some of the C++ libraries use, where you don’t know what is a pointer and just treat all memory as potential references. But that is a band-aid for C++. Why would Python, which is aware of object locations, ever go to that? I concluded it was unlikely.

The question has to be how far can CPython drift from its current model and not have everything fall apart? Will this proposed idea fall apart before or after that point?

During the implementation of this proposal I asked myself whether my current implementation of handling GC was wrong, as my Java side creates objects with no linkage at all in Python and then I come back later and clean them up from the references I held. That sent me into a full-out panic. Won’t my objects get garbage collected?!
Where is the equivalent of NewGlobalRef? Java would have blown up long ago under that abuse!
I started reworking my module to make sure everything was in a PySet that was reachable until I was done with it. Then I realized that, as written in the current Python implementation, one extra reference will make an object immortal. I thought that was interesting. So can’t I just add a standard black-gray-white GC as a generation 3 to Python? Turns out I could by sending my own set of traverse/visit (I sketched it out but was too lazy to implement it)… then discovered it would be pointless. Likely many modules, including CPython itself, are using this assumption implicitly. If not, I missed the PyObject_LockGlobalReference call.

In short, I think that this idea doesn’t tie Python down more than current assumptions already do, and that it may be generally useful. If you would like to try an experimental full stop-the-world GC in CPython, I am up to try. I don’t know enough about GraalVM’s GC model to know if it has a point where reachability linkage analysis can be performed. But if it does, then this proposal still works.

That doesn’t mean that this isn’t a candidate for the unstable category as defined. But perhaps it is more general than my niche product view is giving it credit for.

Hope I answered your question.

Secondary follow-up… if we do want this idea to support partial collections (such as per-thread), we likely need to add one more phase call. That is, the manager will need to be able to test which leases are subject to collection; that way it won’t expire leases on objects that can’t be collected. For Python that just means a call before or after the current decref stage.

Let’s use an enum rather than numbers for clarity.

  • Phase START informs manager that a new gc cycle is starting.
  • Phase QUERY informs manager that items can be inspected to see if they are in the collectable set.
  • Phase VISIT tells manager it must visit all references in the collectable state.
  • Phase REACHABLE tells manager it can perform reachability tests and rescue items.
  • Phase END closes the current collection.

We would need two query calls in the API: IsCollectable and IsReachable. Both exist in CPython currently. So current Python would do START QUERY VISIT REACHABLE (VISIT*) END. Though I can see there being a swap. (I know it does the collectable marking early, but I don’t know which order will be the least cost.)
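
As a rough sketch of what that enum might look like in C: the identifiers `clgc_phase`, `CLGC_PHASE_*`, `clgc_phase_name`, and `clgc_phase_from_number` are hypothetical, chosen only for illustration. The mapping from the earlier numbers follows the two phase lists in this thread (0 is START, 1 is VISIT, 2 is REACHABLE, 3 is END), with QUERY being new.

```c
/* Hypothetical names for the five phases listed above. */
typedef enum {
    CLGC_PHASE_START,      /* a new GC cycle is starting */
    CLGC_PHASE_QUERY,      /* items may be tested for membership in the collectable set */
    CLGC_PHASE_VISIT,      /* manager must visit all references it holds */
    CLGC_PHASE_REACHABLE,  /* reachability tests and rescue are allowed */
    CLGC_PHASE_END         /* the current collection is closed */
} clgc_phase;

/* Human-readable name, e.g. for the debug print in a reference manager. */
static const char *clgc_phase_name(clgc_phase phase)
{
    switch (phase) {
    case CLGC_PHASE_START:     return "START";
    case CLGC_PHASE_QUERY:     return "QUERY";
    case CLGC_PHASE_VISIT:     return "VISIT";
    case CLGC_PHASE_REACHABLE: return "REACHABLE";
    case CLGC_PHASE_END:       return "END";
    }
    return "UNKNOWN";
}

/* Translate the numeric phases used earlier in the thread.
 * Returns 0 on success, -1 for a number that had no meaning. */
static int clgc_phase_from_number(int number, clgc_phase *out)
{
    switch (number) {
    case 0: *out = CLGC_PHASE_START;     return 0;
    case 1: *out = CLGC_PHASE_VISIT;     return 0;
    case 2: *out = CLGC_PHASE_REACHABLE; return 0;
    case 3: *out = CLGC_PHASE_END;       return 0;
    default: return -1;
    }
}
```

A manager written against the numeric phases could call `clgc_phase_from_number` at its entry point and switch on the enum from there, which keeps the QUERY case visible even before any collector emits it.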

My idea remains the same: give an absolutely minimalist API (install/remove, is-collectable, is-reachable, one function def) and leave it to the users to decide how it is used. If they can’t handle some pattern of GC actions in some future Python 3.27 ==> their problem. All you have to guarantee is that those four functions operate well enough that reachability analysis is possible, which is a low bar.

Remember that if we miss in this analysis, it simply ends up with a failure to report a reference loop which may have been resolvable. If we get it again in another GC cycle it will just get collected anyway. This mechanism is providing hints to help break cycles. Users should still avoid unnecessary cycles and follow good memory management practices (a simple del call when the user was done would have solved the loop issue).

CLGC is a support method to deal with the fact that many users aren’t aware of how GC operates, having never seen the bad old days when GC was not provided.

As for the unstable designation, I have long hoped to get JPype to run with other Python implementations. Thus if there were a way to make reference managers work not just in CPython but also PyPy it would be great.

Hi, I’m the developer of Chaquopy. I don’t know much about the internals of garbage collectors, so I’m not sure how much I can add to this conversation, but I have a few questions.

in the case of PyJNIus they would need modifications to switch from strong global references to the double weak for holding their objects

What do you mean by “double weak”? OK, I see there are some details of this in the top comment, but they weren’t copied into the more readable second comment for some reason.

Chaquopy uses strong references for all Java objects which are exposed to Python code, and vice versa. Our approach has been that a memory leak is always preferable to a dangling pointer. However, we do have some tests to verify that these references are released when the owning object is destroyed.

It checks whether the Java object has already been visited by Java’s garbage collector during its cycle.

This would require pretty deep integration with the Java garbage collector, wouldn’t it? In particular, it would require a separate implementation for every JVM, and maybe even different versions of the same JVM.

Internalization is only possible on the Python side because Java lacks a traversal mechanism to discover relationships.

What relationships do you mean? If the “DFS discovers a loop involving both Python and Java objects”, doesn’t this require it to be aware of relationships between Java objects? For example, there could be a loop containing one Python object and two Java objects, like this:

P1 -> J1 -> J2 -> P1

Use of strong references in both directions will produce reference loops unless only one side can hold references. This is the dilemma faced by JPype. Once Java proxies were introduced, it started creating memory leaks. We solved this by making one side weak, which requires that Python maintain a reference to its own object. That is workable but leaves the potential for a dangling reference. As we are now trying to make our bridge bidirectional (both sides can manipulate the other’s objects), that is untenable.

The point of CLGC is that one removes all strong references between the languages and instead makes a reference manager within each language responsible for holding the leases. This is how distributed garbage collectors, in which symmetric GCs run on multiple computers, operate. There are some DGCs that are cross-language and therefore do operate with more than one GC. As creating a strong reference between two computers on different nodes is impossible, a centralized manager must live in global space and connect a strong reference within the language itself. That was the starting point for CLGC. Unfortunately, no DGC library in Python or Java that I was able to locate had any local hooks I could exploit.

This is where DGC and CLGC differ. In DGC the leases are typically time-based, and each node must ping the other to establish that the other is alive. A missed ping and the references become dangling. Not every DGC is even able to represent cyclic connections, much less resolve them. CLGC instead is tied directly to the pace of the two garbage collectors. Whenever Python runs its largest GC cycle, the graph analysis takes place. If something was no longer connected on the Python side, then its support is transferred to the Java side in the form of a change of ownership from the reference manager to the cross-language links themselves. When Java runs its GC, one of two things is going to happen. Either the chain is supported in some way through one of the languages, P1->J1->J2->P2->P3->J3 (where J2 to J3 were the unsupported leases that Python reported in the last cycle and J1 is a lease owned by Java tied to P1’s lifespan), or it discovers that it had been a reference loop, P1->J1->J2->P2->P3->J3->J4->P4->P1 (both J2->J3 and J4->J1 got reported; Java sees it as J1->J2->J3->J4->J1 after the Python report, so unless there happens to be a static J5->J3 to hold the loop up, everything will get collected on the next cycle).
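
The loop example can be checked with a small model of what Java ends up seeing. This is a hedged sketch: `JavaHeap`, `build_loop`, and `collect` are hypothetical names, and a plain mark pass over an adjacency matrix stands in for the real Java collector. Indices 1 to 5 correspond to J1 to J5.

```c
#include <string.h>

#define N_OBJ 6   /* index 0 is unused so that J1..J5 map to 1..5 */

typedef struct {
    int edge[N_OBJ][N_OBJ];  /* edge[a][b] != 0 means Ja keeps Jb alive */
    int root[N_OBJ];         /* held by a static or by the strong lease list */
    int live[N_OBJ];         /* result of the last mark pass */
} JavaHeap;

/* Mark everything reachable from Ja. */
static void mark(JavaHeap *h, int a)
{
    int b;
    if (h->live[a])
        return;
    h->live[a] = 1;
    for (b = 1; b < N_OBJ; b++)
        if (h->edge[a][b])
            mark(h, b);
}

/* Run a mark pass from the roots; return how many of J1..J4 survive. */
static int collect(JavaHeap *h)
{
    int a, survivors = 0;
    memset(h->live, 0, sizeof h->live);
    for (a = 1; a < N_OBJ; a++)
        if (h->root[a])
            mark(h, a);
    for (a = 1; a <= 4; a++)
        survivors += h->live[a];
    return survivors;
}

/* The loop from the text, as Java sees it after the Python report. */
static void build_loop(JavaHeap *h)
{
    memset(h, 0, sizeof *h);
    h->edge[1][2] = 1;  /* J1 -> J2: ordinary Java field */
    h->edge[3][4] = 1;  /* J3 -> J4: ordinary Java field */
    h->edge[2][3] = 1;  /* J2 -> J3: reported by Python (via P2, P3) */
    h->edge[4][1] = 1;  /* J4 -> J1: reported by Python (via P4, P1) */
}
```

While J1 and J3 are still on the strong list (`root[1]` and `root[3]` set), all four objects survive. Once Python's report removes them from that list, `collect` finds no survivors, and adding a static J5 with an edge to J3 keeps the whole loop alive again.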

As for portability of this scheme, there is no reason for ANY modification to the Java GC. All that happens is moving around objects in lists and sets. Java is just a patsy in this scheme unaware it is being used. There is no integration at all beyond a class which is holding a reference to itself as a singleton. That connects the lease list within Java. Just like JPype it works on the Oracle, OpenJDK, and Android gcs.

So long as the reference manager is holding leases properly and no loop is cleaned up until after all references are lost on both languages there is never a chance of any dangling references.

For more details on the Java side of the implementation, see the diagram in the JPype repo, as I am trying to keep this thread focused on the Python requirements.

@mhsmith If you want to use this in Chaquopy (so that we have two independent users of the API), I’d be happy to sponsor the PEP.

Also cc @hoodmane & @brettcannon – I don’t really know much about WASM in this respect, but I assume there are some plans/APIs/workarounds for cross-language GC?


Thanks to Karl for coming up with this, but I don’t think I could justify spending the time to implement it in Chaquopy in the foreseeable future. Although it’s a known problem, I don’t think any of our users have ever complained about it, and it has some reasonably easy workarounds.

Don’t you get it mostly for free if JPype implements it? The reference manager for both sides will be available, and all you will need to do is call ManagerNewReference in place of NewGlobalRef and add a ManagerDereference to recover the Java local reference prior to starting work on a Java object. Same thing for the Python-to-Java layer.

That is how I will transition once the api is in place.

To have a use for the API, a module has to be at the level where both sides are storing references to each other. There are 3 Java codebases that fit the description (JPype, PyJNIus and forks, and JEP). The other two can ride JPype’s coattails if they choose to. Though it sounds like one group is happy with workarounds.

If not these, the next most likely users are pythonnet, WASM, and maybe golang. But I don’t know if these have the same desire for seamlessness. A primarily one-sided bridge is going to have either no loops or depend heavily on the weak-reference-with-dangling solution. WASM and golang generally don’t have reflection capabilities such that it is easy to form reference loops. (Yes, there is a golang reflection API, but my buddy who is a golang fanatic stated it doesn’t get used the same way as Java’s.) WASM doesn’t even necessarily have a native GC, as it only provides a memory block interface and users build their own GC on top (if my buddy gave me correct info).

That leaves pythonnet as the most likely cosponsor. I don’t know them, though given the relationship between Java and C# I suspect the same solution I come up with in Java will likely work for them with just an LLM translation on the C# side. There may be differences, as the C# API is more native than Java’s JNI. This, like the strings API, is very critical to JPype, so I am willing to put in the time to work with someone outside my area to get it accomplished.

If it’s that simple then it might be feasible, though I don’t fully understand the design yet, and I don’t have time to look into it any deeper at the moment.

It’s fine if you can’t delve into the details of GC. I dragged you (and several others) into the conversation and you were the only one of the interested parties that took the time to look over what is a very complex interaction and give an opinion. Thank you!

The implementation for a Java module will be as I describe, just another API bolted on to JNI that does the dirty work of detecting and removing cycles. The reference you will need to hold will be slightly larger, as you need both a jweak and the lease identity. Jweaks can’t be passed to Java calls, but they can be used for NewLocalRef. That is all you need to know to use the API, other than borrowing the code for the two reference managers and how to properly start and stop them. As you are on Android, you also get to dodge the thorny problem of the JVM disappearing on you.

All three of the codebases that run on Java can use my API. I write detailed developer guides (though it takes a long time) because I may get run over by a bus at any time. So once developed, it should be an easy call as to whether you want to support it or not.

So maybe Android can be another user of this API. But it would still be nice to get someone who isn’t Java as a sponsor, as it would be much more compelling in terms of need. JPype already covers Kotlin and Scala, but that doesn’t make them independent either. Pythonnet has yet to respond to my issue request on GitHub. I am still at a loss as to what other modules have similar GC issues.

Some appropriate projects to contact about WASM would be Pyodide and PyScript.


From a WASI perspective, there is no cross-language GC in the way that you might be thinking in terms of Python when you’re using a CPython interpreter (basically it’s pass-by-value). There is general special-casing for JS and DOM objects in the browser, though, when it comes to the WASM spec.

Now, WASM does have a GC extension that lets languages that compile down to WebAssembly have a GC provided for them. But that does require there be a Python-to-WebAssembly compiler.
