Customizing the builtins module

mrolle45 · May 24, 2022, 1:26am

A message from Eric Traut mentioned that pyright works with custom Python distros to discover names that have been added to builtins. I’m not sure how these distros specify added names. But it got me to thinking…
Could standard Python come with some way of either adding names to builtins or even replacing it altogether?
A good use case would be that in my project, I want certain names to be visible to all the modules in the project. At present, I would have to define these in some other module which is imported into each module in the project, so that those names are part of the module globals. Having them as part of builtins would expose them as globals (if not also set as globals in the module). It would have the same effect, but without having to write special code in every module.
These names would also be seen by all library modules, which is unfortunate, but I don’t think that would be an issue in most circumstances. Any use of an added name that the library itself leaves undefined would succeed, where normally it would raise a NameError. If the code actually tests for the presence of the name in the module, that could be a problem.
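
A minimal sketch of that last concern, assuming a library that probes for an optional name by catching NameError. The name fast_json is hypothetical and only stands in for whatever an application might add to builtins.

```python
import builtins

def library_has_fast_json():
    """Library-style probe that relies on NameError for undefined names."""
    try:
        fast_json  # hypothetical optional name, normally undefined
    except NameError:
        return False
    return True

before = library_has_fast_json()   # the name is undefined everywhere
builtins.fast_json = object()      # the application adds the name to builtins
after = library_has_fast_json()    # the same probe now finds it
del builtins.fast_json             # undo the change

print(before, after)  # False True
```

The probe changes its answer even though the library's own code and globals are untouched.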
The interpreter could also modify its global name lookup so that it looks at either the standard builtins or the alternate builtins based on what the current module is.

Here’s a simple idea, similar to what pyright does.

When importing a user module, look for a file named __builtins__.py “somewhere”. “somewhere” is yet to be sorted out; it could just be in the project root directory and/or the directory containing the imported file.
If this is found, then import it and put it in sys.modules under its full name.
Add a name __builtins2__ to the imported module object, a reference to the __builtins__ module.
If you want to get really fancy, there could be many __builtins__.py modules in the directory tree. In that case, __builtins2__ is the innermost module. And since that module has another __builtins__.py above it, it will have a __builtins2__ attribute of its own.
The interpreter, when looking for a name in a global context (such as a LOAD_GLOBAL bytecode) will look in the globals and the standard builtins namespaces as usual, but if that fails, look also in the __builtins2__ module. If that fails, look in __builtins2__.__builtins2__ (if that exists), and so on.
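
The lookup order described above can be simulated in pure Python. This is only a sketch of the proposal, not how the interpreter implements LOAD_GLOBAL; the function lookup_global and the module names used here are hypothetical.

```python
import builtins
from types import ModuleType

_MISSING = object()

def lookup_global(name, module):
    """Resolve name as proposed: globals, builtins, then the __builtins2__ chain."""
    namespace = vars(module)
    if name in namespace:
        return namespace[name]
    value = getattr(builtins, name, _MISSING)
    if value is not _MISSING:
        return value
    # Walk outward through __builtins2__, __builtins2__.__builtins2__, ...
    extra = namespace.get("__builtins2__")
    while extra is not None:
        if name in vars(extra):
            return vars(extra)[name]
        extra = vars(extra).get("__builtins2__")
    raise NameError(f"name {name!r} is not defined")

# Stand-ins for two nested __builtins__.py files and one user module.
outer = ModuleType("project.__builtins__")
outer.project_name = "demo"

inner = ModuleType("project.pkg.__builtins__")
inner.helper = lambda: "inner helper"
inner.__builtins2__ = outer

user = ModuleType("project.pkg.user")
user.x = 1
user.__builtins2__ = inner

print(lookup_global("x", user))             # 1 (module globals)
print(lookup_global("len", user) is len)    # True (standard builtins)
print(lookup_global("helper", user)())      # inner helper
print(lookup_global("project_name", user))  # demo
```

A name found nowhere in the chain still raises NameError, as it does today.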

pf_moore · May 24, 2022, 7:57am

I’m hesitant to even point this out, because modifying builtins is a really bad idea, but you can do this already just by modifying the existing __builtins__ object:

>>> __builtins__.foo = 12
>>> foo
12

This is noted under the documentation for the builtins module, but is explicitly noted as an implementation detail.

Honestly, though, you should just import the names when you need them - the problems with modifying builtins aren’t worth the gain of avoiding a few imports.

malemburg · May 24, 2022, 9:03am

Agreed.

I had been adding new builtins to Python via the mxTools package many years ago. It seemed like a good idea to make those builtins available to all code in a project (importing mx.Tools once would register the builtins).

But when I started using this approach, it quickly became apparent that I lose control over where those new builtins are used, making it difficult to track package dependencies.

Since then I always imported the functions using regular imports in each of the modules using them, which resolved the problem.

steven.daprano · May 24, 2022, 9:34am

The __builtins__ dunder is the implementation detail, not the existence of builtins at all.
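
One way to see why the dunder is fragile, assuming CPython: in __main__ the name __builtins__ is bound to the builtins module, but in a namespace that exec() populates it is the module's dictionary instead, so attribute assignment on it would fail there. A small sketch:

```python
import builtins

namespace = {}
exec("pass", namespace)  # exec() inserts a __builtins__ entry if one is missing

injected = namespace["__builtins__"]
print(type(injected).__name__)      # dict
print(injected is vars(builtins))   # True: the same namespace, as a dict
```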

Using the dunder name __builtins__ is not portable, but the official way to get access to the builtins module, which is to import it, should be:

import builtins  # no underscores

And now you have a module that you can monkey-patch like any other module. (There’s just no .py file involved – the builtins module is built in to the interpreter.)
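
As a rough illustration of that monkey-patching (the function greet is hypothetical), a name attached to the builtins module becomes visible from a namespace that never defined or imported it:

```python
import builtins

def greet(name):
    return f"Hello, {name}!"

builtins.greet = greet  # add a new name to the builtins module

# A fresh namespace with no greet of its own, standing in for another module.
other_module = {}
exec("message = greet('world')", other_module)
print(other_module["message"])  # Hello, world!

del builtins.greet  # undo the patch
```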

Beware that over-writing existing names is a really bad idea:

builtins.len = "Surprise!"

and will cause havoc. You don’t want to do that. (But note that just quitting the interpreter and restarting will reverse all the changes.)

Adding new names to the builtins should be safe:

builtins.myfunc = myfunc

providing you only do it as part of your own application (or in the interactive interpreter), i.e. in situations where you control the environment.
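
A sketch of how an application might enforce the "add, never overwrite" rule. The helper install_builtin is hypothetical and not part of any library; it simply refuses to touch a name that builtins already defines.

```python
import builtins

def install_builtin(name, obj):
    """Add obj to builtins under name, refusing to shadow an existing name."""
    if hasattr(builtins, name):
        raise ValueError(f"builtins already defines {name!r}")
    setattr(builtins, name, obj)

def myfunc():
    return "hello from myfunc"

install_builtin("myfunc", myfunc)  # new name: accepted

try:
    install_builtin("len", "Surprise!")  # existing name: rejected
except ValueError as exc:
    refused = str(exc)

print(refused)  # builtins already defines 'len'

del builtins.myfunc  # undo the patch
```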

If you don’t control the environment (i.e. in a library), then this sort of monkey-patching is not quite as dangerous as what the Ruby community does, but it’s still rather risky.

But honestly, this is more of a neat trick than a serious technique. It makes your application more fragile and makes problems harder to debug, by combining all the disadvantages of from module import * with the disadvantages of a single application-global namespace. I don’t recommend it for anything but smallish applications.

pf_moore · May 24, 2022, 12:54pm

I knew it was __builtins__ that was the implementation detail, but it hadn’t occurred to me to check if I could patch builtins directly. So yeah, you can do that. But as I said, please don’t, except in the privacy of your own application if you must.

barry · May 24, 2022, 4:29pm

I do something similar in the @public package, though not by default. It can be both convenient and mysterious, but ultimately I think it’s better to explicitly import the symbols, because ensuring that builtins is monkeypatched at the right time is tricky given all the ways import order can be futzed with.
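
The timing problem can be sketched as follows. The module body and the name shared_setting are hypothetical, and exec() stands in for importing a module whose top-level code runs at import time.

```python
import builtins

# Hypothetical module body that uses a patched-in name at import time.
MODULE_SOURCE = "VALUE = shared_setting * 2"

def run_module(source):
    """Execute source in a fresh namespace, much as an import would."""
    namespace = {}
    exec(source, namespace)
    return namespace

# "Imported" before builtins has been patched: the name does not exist yet.
try:
    run_module(MODULE_SOURCE)
    early_result = "ok"
except NameError:
    early_result = "NameError"

# "Imported" after the patch: the same source now works.
builtins.shared_setting = 21
late_result = run_module(MODULE_SOURCE)["VALUE"]
del builtins.shared_setting  # undo the patch

print(early_result, late_result)  # NameError 42
```

Whether the module works depends only on when it runs relative to the patch, which is exactly what import order controls.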

mrolle45 · May 24, 2022, 8:16pm

I figured that monkey patching builtins was tricky. However, can it be done?

zware · May 24, 2022, 9:02pm

It can be done. Please don’t.

steven.daprano · May 25, 2022, 12:19am

Michael Rolle said:

“I figured that monkey patching builtins was tricky. However, can it be done?”

We’ve answered this in this thread. Perhaps re-read it carefully?

It’s not difficult; builtins is just a module like any other.
You need to set up your monkey-patching as part of your application’s initialization. That may work against modularisation.
Use import builtins to be portable.
Adding new names, e.g. builtins.myfunc = myfunc, should be safe enough.
Do not remove or replace anything; you will break things.
But we really don’t recommend this as a technique, for reasons already discussed.

This monkey-patching of builtins may seem convenient, but for small applications the convenience factor is outweighed by the setup costs. And for large applications, the convenience is far outweighed by the costs to maintainability and debugging. The larger the application, the worse it gets.

There may be a “Goldilocks” spot, not too small, not too big, just the right size, where the convenience factor wins out, but even there, we don’t recommend it.
