Creating a handler for importing a new file extension

crj · January 27, 2026, 9:54pm

I want to augment the import machinery to support importing from files with a .foo extension using FooLoader.

While I wasn’t expecting that to be super-easy, I assumed there was at least some legal-feeling way to do it.

After staring at the Python 3.14.2 documentation ("5. The import system" and "importlib: The implementation of import"), scratching my head, and digging around, the best strategy I could see was:

Search sys.path_hooks for an entry with a __qualname__ that begins 'FileFinder.path_hook.'
Extract sys.path_hooks[i].__closure__[1].cell_contents
Replace sys.path_hooks[i] with a new FileFinder.path_hook listing all the existing loader_details plus my new entry
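
As a rough sketch, those three steps might look like the following. FooLoader and add_suffix_to_file_finder are hypothetical names, and the code relies on undocumented CPython details (the hook's __qualname__ and the name of its closure variable, loader_details), so it could break in any release. It looks the closure cell up by name rather than by index 1.

```python
import sys
from importlib.machinery import FileFinder, SourceFileLoader


class FooLoader(SourceFileLoader):
    """Placeholder loader (hypothetical); a real one would transform the source."""


def add_suffix_to_file_finder(loader, suffix):
    """Swap the stock FileFinder hook for one that also knows `suffix`.

    Returns True if a FileFinder hook was found and replaced.
    """
    for i, hook in enumerate(sys.path_hooks):
        qualname = getattr(hook, "__qualname__", "")
        if not qualname.startswith("FileFinder.path_hook."):
            continue
        # Assumption: the closure holds a free variable named loader_details.
        cells = dict(zip(hook.__code__.co_freevars, hook.__closure__))
        details = list(cells["loader_details"].cell_contents)
        details.append((loader, [suffix]))
        sys.path_hooks[i] = FileFinder.path_hook(*details)
        # Finders already built by the old hook do not know the new suffix.
        sys.path_importer_cache.clear()
        return True
    return False
```

Calling add_suffix_to_file_finder(FooLoader, ".foo") before the first import of a .foo module should be enough. Because the replacement hook has the same shape as the original, a second module can repeat the trick with its own suffix.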

OK, an alternative would be to hard-code the default loader_details into my code rather than jumping through hoops to extract them programmatically. But that feels even more likely to break in future versions of Python, and has the significant disadvantage that it wouldn’t cope with multiple modules each attempting to add their own file extension.

Have I overlooked some clean way to do this?

JamesParrott · January 28, 2026, 12:10pm

Without having gone into the details, this should be straightforward: add a custom finder or combined finder/loader to sys.meta_path (or however they are registered). That object’s find_spec method can run any code desired, and takes a name argument. For example, it should be simple to loop over each dir_ in sys.path and check if f"{dir_}/{name}.foo" is a file (on POSIX anyway; use pathlib instead for platform independence). More code is needed for packages, e.g. to support __init__.foo (I would not assume that automatically plays nicely with relative imports). From there, personally I’d copy the .foo to a temp folder in sys.path with a .py extension, and invoke the normal import machinery.
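
A minimal sketch of that idea, for top-level modules only, could look like this. TopLevelFooFinder is a hypothetical name, the .foo file is assumed to contain ordinary Python source, and instead of copying the file to a temp folder it hands the path straight to a SourceFileLoader:

```python
import sys
from importlib.abc import MetaPathFinder
from importlib.machinery import SourceFileLoader
from importlib.util import spec_from_file_location
from pathlib import Path


class TopLevelFooFinder(MetaPathFinder):
    """Hypothetical finder: resolves top-level `name.foo` files on sys.path."""

    def find_spec(self, fullname, path=None, target=None):
        if path is not None or "." in fullname:
            return None  # submodules and packages are out of scope here
        for entry in sys.path:
            if not isinstance(entry, str):
                continue
            # An empty sys.path entry means the current directory.
            candidate = Path(entry or ".") / f"{fullname}.foo"
            if candidate.is_file():
                loader = SourceFileLoader(fullname, str(candidate))
                return spec_from_file_location(
                    fullname, str(candidate), loader=loader
                )
        return None
```

It would be registered with sys.meta_path.append(TopLevelFooFinder()). Appending rather than prepending means the finder is only consulted after the standard finders have failed, so an ordinary module of the same name wins.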

James Murphy managed to import from the cloud https://www.youtube.com/watch?v=2f7YKoOU6_g (please note his security warnings). His code could be easily adapted (and made safer!).

crj · January 28, 2026, 3:22pm

That doesn’t seem at all the right place to be implementing it: by my understanding, meta paths are for finding modules, not individual files within a module. The specialisation point for individual files (at least for modules located by PathFinder rather than the mechanisms for built-in modules, etc.) is sys.path_hooks.

The trouble is that, by default, that list contains just the ZIP file importer and the FileFinder hook… which was given its list of file suffixes and corresponding loaders when it was constructed, long before your program begins running.

So I replace it with a new FileFinder that supports more suffixes. Which is fair enough, apart from the above-mentioned underhanded tricks for:

Identifying the FileFinder hook in sys.path_hooks (the hook is a closure, not an actual FileFinder or bound method)
Extracting the list of suffixes and loaders from the function’s closure

Those two steps are hacky and fragile, which is why I was hoping someone knew of an alternative.

JamesParrott · January 28, 2026, 3:55pm

by my understanding, meta paths are for finding modules, not individual files within a module

I suggest you re-read the documentation you yourself posted.

Meta hooks are registered by adding new finder objects to [sys.meta_path]

When the named module is not found in sys.modules, Python next searches sys.meta_path, which contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module.

Like I said, I’ve not gone into the details. That’s your job.

I’ve given you an example, that I trust works. Adapt it.

I’m sorry if you feel any steps are hacky, but the whole purpose of the exercise is to import a .foo file instead of following the normal naming convention (and messing around with the import system is always bug-prone).

crj · January 29, 2026, 3:00pm

I’ve stared at this even harder, and I remain convinced that implementing this via sys.meta_path is not the answer, here.

To implement it there:

PathFinder has no specialisation points other than sys.path_hooks.
Therefore, to avoid using sys.path_hooks, one must either shadow the implementation of various PathFinder behaviours, or do without:
- Iterate through sys.path and/or the provided path
- Maintain a shadow of sys.path_importer_cache, to hold all our secondary file finders.
- Recursively populate that cache at need, from our own shadow of sys.path_hooks
- Attach to that shadow of sys.path_hooks a FileFinder.path_hook that knows about (solely) the .foo extension.
…and then you still need to implement your actual custom loader!

That’s a lot of existing code that has to be duplicated. Otherwise one will end up breaking top-level .foo modules, or .foo modules within namespace packages. Even then, having a second FileFinder instance per directory would double the amount of stat/listdir churn during import.

So why not at least mitigate much (though far from all) of that by adding a second FileFinder.path_hook to sys.path_hooks? Because any FileFinder.path_hook will give you a FileFinder that searches only for the extensions it understands.

Bear in mind that we might have a package like this:

my_package/
    __init__.py
    module1.py
    module2.foo

…in which case neither the standard FileFinder.path_hook already in sys.path_hooks nor the .foo-specific one would result in a finder that could find all submodules. The only solution I can see would be to have two FileFinders per directory, in two separate caches.
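
That is easy to check with a small experiment. This is only a sketch: probe_with_extra_hook is a made-up name, and plain SourceFileLoader stands in for a real .foo loader. It adds a .foo-only FileFinder.path_hook either before or after the existing hooks, then asks PathFinder for both modules in a directory laid out like the package above:

```python
import sys
import tempfile
from importlib.machinery import FileFinder, PathFinder, SourceFileLoader
from pathlib import Path


def probe_with_extra_hook(foo_first):
    """Return (found module1.py, found module2.foo) with a .foo-only hook added."""
    foo_hook = FileFinder.path_hook((SourceFileLoader, [".foo"]))
    saved_hooks = list(sys.path_hooks)
    saved_cache = dict(sys.path_importer_cache)
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "module1.py").write_text("x = 1\n")
        Path(tmp, "module2.foo").write_text("y = 2\n")
        try:
            if foo_first:
                sys.path_hooks.insert(0, foo_hook)
            else:
                sys.path_hooks.append(foo_hook)
            # PathFinder asks the hooks in order and caches the first finder
            # that accepts the directory; later hooks are never consulted.
            py_spec = PathFinder.find_spec("module1", [tmp])
            foo_spec = PathFinder.find_spec("module2", [tmp])
        finally:
            # Restore the global import state exactly as it was.
            sys.path_hooks[:] = saved_hooks
            sys.path_importer_cache.clear()
            sys.path_importer_cache.update(saved_cache)
    return py_spec is not None, foo_spec is not None
```

probe_with_extra_hook(foo_first=False) finds only module1, and probe_with_extra_hook(foo_first=True) finds only module2: whichever hook accepts the directory first supplies its one and only finder.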

Again, it feels as though the correct specialisation point for adding a new importable file type is the loader_details embedded in the FileFinder.path_hook which is already in sys.path_hooks; it makes fundamental sense for a list of supported suffixes to be the place to put a supported suffix. But that isn’t designed to be augmented, hence the evil jumping through hoops I’m intending to perpetrate unless there’s a realistic alternative.

crj · January 29, 2026, 3:23pm

James Murphy managed to import from the cloud https://www.youtube.com/watch?v=2f7YKoOU6_g (please note his security warnings). His code could be easily adapted (and made safer!).

I feel I should emphasise that that example is inapplicable to what I’m doing:

That video is importing the same kind of thing from somewhere else
I want to import a different kind of thing from the same place

sirosen · January 29, 2026, 3:41pm

I think you may be doing us, and therefore indirectly yourself, a disservice by not sharing what the motivation is for this project. So far all I have is that you want to give valid Python files a nonstandard suffix. But I don’t know why you want this, and that may be the core thing at issue.

Except, why should this be a desirable extension point?

The typical use cases for import system extension are “find modules in a specialized storage system, not the filesystem”. Customization of the path hooks, rather than the meta path, is too late for the common case. And all use cases around customizing imports are relatively rare.

You already seem to have a good handle on the fact that you can duplicate code which wasn’t designed to be extended in the way that you want, and modify that to suit your needs. That sounds to me like a good solution. Can you explain why it’s not satisfactory?

Somewhat separately, I want to note that you provided an example, and it immediately raises two concerns in my mind.

my_package/
    __init__.py
    module1.py
    module2.foo

The first is that not every module is part of a package. e.g., move module2.foo up a level. I think this will require you to write a meta path finder?

The second is name conflicts. What happens if module2.py gets added to the above?
Naturally you can define a precedence order but it gets back to it being unclear why you are doing this.

crj · January 29, 2026, 4:07pm

In general terms, I want to define a file extension for stuff which, though not a Python program, can nonetheless be compiled into a Python module. This could happen by transforming the file contents into Python code, compiling an AST, or similar. In any case, the core of my approach is to make a class that inherits from SourceFileLoader and overrides source_to_code().

That gives some important advantages over techniques such as providing a Python function that loads a resource file and returns some kind of namespace:

The machinery for caching compiled .pycs works as normal, meaning the (potentially costly) source_to_code override isn’t run unnecessarily.
Similarly, the module finds its way into sys.modules like any other, so it isn’t reloaded when it’s used from multiple modules.
You get to use from ... import on it.

The two specific initial examples I have in mind are:

Facilitating making GraphQL queries from Python by taking a .gql file full of queries and exposing each as a function taking the required parameters (including various additional tricks like converting paginated queries into Python iterators).
Creating a templating system that leverages Python 3.14’s t-strings.

…though I can see plenty of other similar intriguing use cases.
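
To make the SourceFileLoader approach concrete, here is a minimal sketch. The file format is invented purely for illustration (each line is NAME: text and becomes a string constant), and translate_foo and load_foo are hypothetical helpers. Extra arguments are forwarded untouched to the base class so the override does not depend on the exact source_to_code() signature.

```python
import importlib.util
from importlib.machinery import SourceFileLoader


def translate_foo(text):
    """Hypothetical translation step: turn `NAME: text` lines into assignments."""
    lines = []
    for line in text.splitlines():
        if line.strip():
            name, _, value = line.partition(":")
            lines.append(f"{name.strip()} = {value.strip()!r}")
    return "\n".join(lines) + "\n"


class FooLoader(SourceFileLoader):
    def source_to_code(self, data, path, *args, **kwargs):
        # get_code() passes the raw file contents as bytes.
        python_source = translate_foo(bytes(data).decode("utf-8"))
        # Let the base class compile, so its options are applied as usual.
        return super().source_to_code(python_source, path, *args, **kwargs)


def load_foo(name, path):
    """Load one .foo file directly, without touching any import hooks."""
    loader = FooLoader(name, path)
    spec = importlib.util.spec_from_file_location(name, path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module
```

Because only source_to_code() is overridden, the inherited get_code() still reads and writes the cached bytecode, so the translation step is skipped when an up-to-date .pyc exists.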

crj · January 29, 2026, 4:17pm

Can you explain why it’s not satisfactory?

There are only two realistic options, both of which entail studying the source code of importlib:

Create classes which mimic parts of it
Specialise stuff at undocumented points not intended for specialisation

Both of those feel extremely fragile to changes between Python versions, and to other people doing the same kind of thing in other projects.

The first is that not every module is part of a package. e.g., move module2.foo up a level. I think this will require you to write a meta path finder?

There are two options:

Only allow this mechanism to be used from packages, and say __init__.py should import this mechanism before importing any modules within the package (so that it is certain to be in place before the FileFinder for its directory has been created).
Call sys.path_importer_cache.clear() once the mechanism is installed.

No, neither of those seems especially elegant, but they work.

JamesParrott · January 30, 2026, 1:06pm

Fair enough. There’s more than one way to approach so many problems, especially in Python (despite The Zen). If you get it working, I’m interested to hear how you did it.

BrenBarn · January 31, 2026, 7:27am

Can you not just create your own FooFinder and add it to sys.path_hooks? I don’t know what kind of loader you would return but it seems like what you’re describing would involve first defining a new finder.

iFreilicht · February 14, 2026, 11:21am

As others have pointed out, you need to implement a MetaPathFinder. I assume that you’ll have .foo files inside regular Python packages and that they contain regular Python code as a starting point, so this is quite simple to start with:

import sys
from importlib.abc import MetaPathFinder
from importlib.machinery import ModuleSpec, SourceFileLoader
from pathlib import Path
from types import CodeType, ModuleType
from typing import Sequence

EXTENSION = ".foo"


class CodeGenLoader(SourceFileLoader):
    """Entry Point for code-generation"""

    def get_code(self, fullname: str) -> CodeType:
        """Override to transform source before compilation."""

        source_path = self.get_filename(fullname)
        print(f"Compiling {source_path}")

        source_bytes = self.get_data(source_path)
        source_text = source_bytes.decode("utf-8")

        return compile(source_text, source_path, "exec", dont_inherit=True)


class MyMetaPathFinder(MetaPathFinder):
    """Finder for custom files"""

    def find_spec(
        self,
        fullname: str,
        path: Sequence[str] | None,
        target: ModuleType | None = None,
    ) -> ModuleSpec | None:
        print(f"Searching Module {fullname} in path {path}")

        module_name = fullname.split(".")[-1]
        if not path:
            # Top-level imports are not implemented! We just assume
            # the file is placed in some valid python module
            return None

        for candidate in path:
            module_file = (Path(candidate) / module_name).with_suffix(EXTENSION)
            if module_file.exists():
                break
        else:
            # No file with our custom file extension matched
            return None

        # Return a custom loader that allows us to modify the
        # code in the module before compiling and executing it.
        loader = CodeGenLoader(fullname, str(module_file))
        spec = ModuleSpec(
            name=fullname,
            loader=loader,
            origin=str(module_file),
        )
        spec.has_location = True
        return spec


# Register the finder with the import system
for index, finder in enumerate(sys.meta_path):
    # Replace the finder if it's already registered, to play nice with
    # module reloading (importlib.reload or `%autoreload 2` in Jupyter)
    if isinstance(finder, MyMetaPathFinder):
        sys.meta_path[index] = MyMetaPathFinder()
        break
else:
    sys.meta_path.append(MyMetaPathFinder())

If you store this code in a file called importer.py and have a directory named shared, containing an __init__.py and myfile.foo, somewhere in your PYTHONPATH, you can use it like this:

>>> import importer  # Enables the MetaPathFinder
>>> from shared import myfile
Searching Module shared.myfile in path ['<redacted>/repos/experiment_2026-02-14/shared']
Compiling <redacted>/repos/experiment_2026-02-14/shared/myfile.foo
>>> myfile.__file__
'<redacted>/repos/experiment_2026-02-14/shared/myfile.foo'

Of course, you probably want to do something with the code before compiling it, so I already included a CodeGenLoader in the example above.

Hope this helps!

crj · February 17, 2026, 10:20am

(There is further discussion over in the proposal I made to fix this.)

Alas, there are several wrinkles to sort out with your code:

For efficiency, it needs a cache equivalent to sys.path_importer_cache
You need to take into account case-insensitivity of extensions under Windows
As you note in your comments, it doesn’t work for top-level imports

Over in that other thread, Paul Moore mentioned Quixote as a project which implements its own file extensions; it turns out it achieved that relatively heinously by overriding an undocumented function that began with an _.

In general, none of these solutions looks especially simple or robust. By contrast, I had a go at implementing my proposed enhancement to importlib, and the volume of code changes needed was surprisingly small: about a third the size of your solution.