In my recent-ish thread about revising PEP 649, Petr brought up the possibility of enhancing .pyc files so we can add additional lazy-loaded stuff. I was discussing this in a private email thread this morning, and had a brainstorm about how it all could work: the API, the semantics, and the implementation.
Quick recap: the current structure of a .pyc file is as follows:
<magic value>
<4-byte flags int>
<8 bytes of stuff, contents vary depending on flags>
<module code object>
AFAIK this structure is invariant for CPython. The “8 bytes of stuff” can be either a 4-byte source modification timestamp (seconds since epoch) followed by a 4-byte source size, or an 8-byte hash of the source.
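For reference, here is a minimal sketch of reading that header in Python. The function name `read_pyc_header` and the returned dict keys are made up for illustration. It assumes little-endian fields and that bit 0 of the flags marks a hash-based .pyc, which matches my understanding of PEP 552.

```python
import importlib.util
import marshal
import struct

def read_pyc_header(data):
    """Split raw .pyc bytes into header fields and the module code object.

    Hypothetical helper, for illustration only.
    """
    magic = data[:4]
    if magic != importlib.util.MAGIC_NUMBER:
        raise ValueError("magic value does not match this interpreter")
    (flags,) = struct.unpack("<I", data[4:8])
    header = {"flags": flags}
    if flags & 0x1:
        # Hash-based .pyc: the 8 bytes are a hash of the source.
        header["source_hash"] = data[8:16]
    else:
        # Timestamp-based .pyc: source mtime, then source size.
        mtime, size = struct.unpack("<II", data[8:16])
        header["source_mtime"] = mtime
        header["source_size"] = size
    # marshal.loads ignores extra bytes after the first value, so anything
    # appended after the module code object does not disturb this call.
    header["code"] = marshal.loads(data[16:])
    return header
```

The last comment is the property the rest of this idea leans on: data appended after the module code object doesn't affect loading the module itself.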
I’m gonna call our new thing an “overlay”. This isn’t technically an overlay in the traditional sense, but it’s a good enough word to hang the concept on for now–we can find a better word later. So.
An “overlay” is an optional code object appended to a .pyc file, referenced by a “name”. An overlay “name” is a marshallable, hashable, constant value–a string, an int, etc.
“Loading” an overlay means running the overlay code object in an existing module’s namespace. This can do whatever it needs to in the module–add new attributes, modify existing ones, run arbitrary code. There is no mechanism to “unload” an overlay.
The overlay loading machinery remembers which overlays have been loaded by maintaining a __overlays__ attribute in the module’s namespace. It’s unset when the module is new, and only added after loading the first overlay. It’s a set object containing the names of all the loaded overlays.
We add a new function to load overlays, something like
int PyImport_LoadOverlay(PyObject *module, PyObject *name, int force_reload);
You’d call it with an already-loaded module object, the name of the overlay you want to load, and the force_reload flag. It’d return nonzero for success and zero for failure. force_reload relates to caching: if force_reload is zero, it doesn’t re-load an overlay that’s already been loaded; if it’s nonzero, it always loads the overlay, whether or not it was previously loaded. The return value could indicate what happened, like 1 for “successfully loaded” and 2 for “already loaded, didn’t reload”.
We’d also provide this function in Python, presumably in the sys module. Something like
def load_overlay(module, name, *, force_reload=False):
    ...
The simplest possible way to store overlays would be appending alternating names and values of the overlays to the .pyc file. In this example, our .pyc file has three overlays with the names 'foo', 'bar', and 'third thing':
<magic value>
<4-byte flags int>
<8 bytes of stuff, contents vary depending on flags>
<module code object>
<constant, string 'foo'>
<overlay code object 'foo'>
<constant, string 'bar'>
<overlay code object 'bar'>
<constant, string 'third thing'>
<overlay code object 'third thing'>
But this would force us to wade through N overlays to find the one we wanted.
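A small sketch of that linear scan, assuming the names and code objects are each written with `marshal.dump`. The helper name `find_overlay_linear` is invented for this example. As far as I know, marshal data carries no overall length prefix for a code object, so the only way past an unwanted overlay is to unmarshal it.

```python
import io
import marshal

def find_overlay_linear(stream, name):
    """Scan alternating (name, code object) pairs from the current position.

    Hypothetical helper; returns the code object, or None if not found.
    """
    start = stream.tell()
    end = stream.seek(0, io.SEEK_END)
    stream.seek(start)
    while stream.tell() < end:
        candidate = marshal.load(stream)
        # Every overlay before the one we want gets fully unmarshalled,
        # even though we then throw it away.
        code = marshal.load(stream)
        if candidate == name:
            return code
    return None
```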
With only slightly more data, namely one 4-byte int per overlay code object plus two additional 4-byte ints, we could create a structure (inspired by zip files) that could still be written in one pass but lets us load any overlay with at most three seeks: one to the trailing offset at the end of the file, one to the “directory”, and one to the overlay itself.
<magic value>
<4-byte flags int>
<8 bytes of stuff, contents vary depending on flags>
<module code object>
<overlay code object 'foo'>
<overlay code object 'bar'>
<overlay code object 'third thing'>
<number of overlays, 4-byte int> # start of "directory"
<constant, string 'foo'>
<absolute seek offset for overlay code object 'foo'>
<constant, string 'bar'>
<absolute seek offset for overlay code object 'bar'>
<constant, string 'third thing'>
<absolute seek offset for overlay code object 'third thing'> # last entry in "directory"
<absolute seek offset to number of overlays, i.e. the start of the "directory">
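Here is a sketch of writing and reading this layout, under a few assumptions of mine: the 4-byte ints are little-endian and unsigned, names and code objects are serialized with `marshal`, and the stream is already positioned just past the module code object when writing. `write_overlays` and `read_overlay` are illustrative names, not proposed APIs.

```python
import io
import marshal
import struct

def write_overlays(stream, overlays):
    """Append overlay code objects, then the directory, in a single pass."""
    offsets = {}
    for name, code in overlays.items():
        offsets[name] = stream.tell()
        marshal.dump(code, stream)
    directory_start = stream.tell()
    stream.write(struct.pack("<I", len(offsets)))
    for name, offset in offsets.items():
        marshal.dump(name, stream)
        stream.write(struct.pack("<I", offset))
    # Trailer: where the directory begins.
    stream.write(struct.pack("<I", directory_start))

def read_overlay(stream, name):
    """Load one overlay with three seeks; returns None if it is absent."""
    stream.seek(-4, io.SEEK_END)                       # seek 1: trailer
    (directory_start,) = struct.unpack("<I", stream.read(4))
    stream.seek(directory_start)                       # seek 2: directory
    (count,) = struct.unpack("<I", stream.read(4))
    for _ in range(count):
        entry_name = marshal.load(stream)
        (offset,) = struct.unpack("<I", stream.read(4))
        if entry_name == name:
            stream.seek(offset)                        # seek 3: the overlay
            return marshal.load(stream)
    return None
```

Note that the reader only ever unmarshals names and the one code object it was asked for; the other overlays are never touched.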
In practice we could get it down to two seeks or even one. For example, we could seek to EOF-4k bytes and read 4k bytes. With luck, that’ll get you the entire “directory” and the ending absolute seek offset, and if you’re extra-lucky the overlay you want to load would be in that 4k chunk too.
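A sketch of that tail-read trick, again assuming a little-endian 4-byte trailer holding the directory's absolute offset. `read_directory_from_tail` is a made-up name, and it returns the raw directory bytes rather than parsing them.

```python
import io
import struct

def read_directory_from_tail(stream, chunk_size=4096):
    """Return the raw directory bytes, reading the file's tail in one gulp."""
    # File size; with a real file, os.fstat would avoid this extra seek.
    size = stream.seek(0, io.SEEK_END)
    tail_start = max(0, size - chunk_size)
    stream.seek(tail_start)
    tail = stream.read()
    (directory_start,) = struct.unpack("<I", tail[-4:])
    if directory_start >= tail_start:
        # Lucky case: the whole directory is already in memory.
        return tail[directory_start - tail_start:-4]
    # Unlucky case: the directory starts before our chunk, so seek again.
    stream.seek(directory_start)
    return stream.read(size - 4 - directory_start)
```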
Any good?