[re-posting here for visibility, should have done both at the same time!]
Cinder is Meta’s performance-oriented version of CPython 3.8. It has been in use as the production Python behind the Instagram server for years, as well as powering various other Python applications across Meta.
We are interested in making the improvements in Cinder more broadly available to the Python community. As a collection of features and components built on top of CPython, we’d like to:
- For core features that are tightly coupled with CPython internals, merge them into CPython (after we port them over to the main branch).
- For features that can work as extension modules (like the Cinder JIT), extract them as new open source, standalone, pip-installable extension modules. To enable those, we will need to add a few extension hooks to CPython, which we intend to carve out and port over to the main branch in order to merge upstream.
Please read more about our upstreaming strategy and a few Cinder features in this document:
Complete source code of our CPython fork:
We have started filing bpo issues for a few of the pieces we want to merge upstream (links are in the Google doc), and we look forward to feedback from the community on the issues, on the overall approach, or on the features themselves.
Thanks for writing this up!
As you note, a lot of this work overlaps with the Faster CPython project, and if you can coordinate with them I don’t see why the parts couldn’t be merged. Pull requests for the bpo issues are the next step :)
I wonder how much of the Lazy Imports would be doable using a custom loader with a custom .pyc compilation step, to avoid needing this in core CPython right now. Is that worth considering? (Alas, I haven’t actually looked at the code.)
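For the no-changes-to-core route, the stdlib already offers a building block: `importlib.util.LazyLoader` wraps an existing loader so a module’s body only executes on first attribute access. A minimal sketch (essentially the recipe from the `importlib` docs, not Cinder’s mechanism):

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose body runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # installs the lazy proxy; body not yet run
    return module

json = lazy_import("json")   # nothing executed yet
json.dumps({"a": 1})         # first attribute access triggers the real import
```

This covers the common case without touching the interpreter, though it works at module granularity and doesn’t address the deferred-object semantics discussed below.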
We’re definitely working with the Faster CPython folks to make sure we’re coordinated there, and we’ve been opening the bpo issues for the immediate concrete changes!
On the lazy imports issue, I think what you’re suggesting is probably possible, but it’s not going to be able to get the abstraction 100% right, and it’s going to take a significant performance hit.
The initial implementation of this actually hooked into the various opcodes in the interpreter loop and checked for the deferred objects, and an equivalent implementation could indeed be done at the .pyc level: for example, LOAD_NAME would be followed by extra opcodes that check whether the object is deferred and resolve it, rather than embedding that check in the interpreter loop. But even with the implementation in C there were perf regressions with this approach, and it would obviously be much slower in interpreted code.

Another problem was that it was just hard to track down all of the various places where the deferred objects could escape. For example, it led to needing logic in LOAD_FAST to handle deferred objects, which is kind of terrible. And fundamentally, because the globals live in a normal, un-abstracted dict that is fully exposed, once that leaks so do the deferred objects.
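The escape problem above can be illustrated in pure Python. This sketch is not Cinder’s implementation; the `Deferred` class is a hypothetical stand-in for a not-yet-resolved lazy import. Any opcode-level resolution scheme only helps at the opcodes it instruments, while a plain dict read hands back the raw placeholder:

```python
import importlib

class Deferred:
    """Hypothetical stand-in for a not-yet-resolved lazy import."""
    def __init__(self, name):
        self.name = name
    def resolve(self):
        return importlib.import_module(self.name)

ns = {"json": Deferred("json")}  # pretend this is a module's __dict__

# An opcode-level scheme would resolve this at LOAD_NAME, but direct
# dictionary access bypasses the interpreter's checks entirely:
obj = ns["json"]
print(type(obj).__name__)  # the raw Deferred placeholder has escaped
```

The same leak happens through `vars(module)`, `globals()`, `module.__dict__`, or anything else that reads the dict directly.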
Both the performance concerns and the escaping concerns led to the maybe dramatic decision to embed the logic in the dictionary object itself, which actually turned out to be performance neutral: we could leverage the existing dk_lookup mechanism so that only dictionaries containing deferred objects paid any cost. It also solved the problem of deferred objects escaping, because they were handled at the source of truth, the dictionary.
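A conceptual Python analogy of resolving at the lookup itself (Cinder’s actual hook is in C, inside the dict’s lookup machinery, and real module globals do not call a subclass’s `__getitem__`, so this is only a sketch of the idea; `Deferred` is hypothetical):

```python
import importlib

class Deferred:
    """Hypothetical stand-in for a not-yet-resolved lazy import."""
    def __init__(self, name):
        self.name = name
    def resolve(self):
        return importlib.import_module(self.name)

class ResolvingDict(dict):
    """Resolve deferred values at lookup time, at the source of truth."""
    def __getitem__(self, key):
        value = super().__getitem__(key)
        if isinstance(value, Deferred):
            value = value.resolve()
            super().__setitem__(key, value)  # cache: later lookups pay nothing
        return value

ns = ResolvingDict(json=Deferred("json"))
mod = ns["json"]        # resolved transparently inside the lookup
mod.dumps({"a": 1})     # a real module object from here on
```

Because resolution happens inside the lookup, every consumer of the dict sees real objects and nothing can escape; the caching step mirrors how only dictionaries that actually hold deferred objects incur any cost.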
So the performance concerns aren’t huge if people just want to experiment with the feature as an optional add-on, but the leaky abstraction is likely to cause a lot more issues when trying to use it successfully in large existing programs.