I haven’t completely given up on this idea and I have a few updates. First, I was worried about how to deal with the “trash can” mechanism but now I think that’s a non-issue. We can use a separate container object to hold the trash. That avoids blowing up the C stack if you dealloc deeply nested stuff. It does introduce the possibility of an allocation failure in the decref code path (growing of trash container fails) but that doesn’t seem a fatal flaw.
I was struggling to decide how to deal with different object sizes (e.g. size class in the obmalloc code) and with “large” objects (i.e. stored outside the arenas, not by obmalloc). I now think those are solvable problems as well. One idea I had recently is that the high bits of the refcnt field can be (ab)used for various things. On a 64-bit machine, you would never need all those bits for the count. So, we could steal some to hold info. E.g. flag if the object is a container, the size class and flag if it is “large”. Having that info directly in the object structure would make the GC traverse process a lot faster.
I have the radix tree version of obmalloc working:
The performance is good, basically the same as the current obmalloc implementation. The radix tree would have to be extended to hold the GC head info (maybe get another level of nodes). I’m encouraged that it works as well as it does in the above PR.
The main reason I’m still interested in this idea is that it could be useful to implement a mark-and-sweep collector to replace the current cyclic GC. See the switching from refcounting to libgc discussion. Also, fixing Copy on Writes from reference counting.