… but if `ModuleType` has a `__call__` method, then every module shows as callable, even those that don’t have a `__call__`.
This seems solvable, but as @encukou said, it’s a nasty little detail to get right.
Yes. `ModuleType.__call__` is what `callable(x)` looks at, and it’s very far from trivial to change that. (Spoiler: you’ll meet CPython’s `tp_call` slot as a miniboss in this rabbit hole!)
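A short sketch of why that lookup rule matters here: `callable()` and the call operator consult the *type*, not the instance, so a `__call__` on `ModuleType` itself would make every module report as callable. The class names `Quiet`/`Loud` below are invented for illustration:

```python
class Quiet:
    pass

q = Quiet()
q.__call__ = lambda: 42   # instance attribute: the call protocol never sees it
print(callable(q))        # False - callable() consults type(q), not the instance

class Loud:
    def __call__(self):
        return 42

print(callable(Loud()))   # True - __call__ lives on the type
```

This is exactly the tension: a per-module opt-in has to somehow defeat the type-level lookup that `tp_call` implements.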
What if I want the static method to be overridable by subclasses?
What if it is actually a class method?
My point is that you can turn a function into a class without breaking your users, but you cannot turn a module into a class, so by using `__call__` you reduce your future design possibilities.
How does documentation work with this proposal? Let’s take the `pprint` example. Assuming the `pprint` module becomes callable, yet exposes other APIs besides its primary calling point, what does `help(pprint)` do?
Like others I think this would create more confusion than clarity.
Thanks. I guess I’m overall neutral on this. I doubt I’d find much use for it myself, but others seem to like it. Issues like `callable()` need sorting out, and as some people have pointed out, it can (slightly) restrict your future design options compared to exporting a single function/class. But it’s a tool people can use, and like any tool, if it doesn’t suit your needs, don’t use it.
The PEP might be more persuasive with a few more examples, but don’t worry about it on my account - I’ve read the examples here and formed my view.
I assume, by the way, that it’s obvious to everyone that deprecating (for example) `from glob import glob` in favour of `import glob` is a bad idea? Add the new approach if you want, but there’s no practical value to breaking all existing uses of the module. And there are lots of people maintaining code that has to support multiple versions - we don’t want to force them to write something like (untested)
```python
import warnings

with warnings.catch_warnings():
    # Suppress the deprecation warning - does this need to be more precise?
    warnings.simplefilter("ignore")
    try:
        # Use the old approach if it's still available
        from glob import glob
    except ImportError:
        import glob

# Phew! That was way too hard...
files = glob("*.py")
```
At a first glance, I feel excited about this.
If I’m understanding it correctly, would this help alleviate the boilerplate in `.py` files that one would like to treat as both a script and a module, so that the script functionality can be imported and used elsewhere?
I tend to agree that the feature would likely cause more confusion than do good.
It is often not obvious what calling a particular module would actually do, and such uncertainty usually results in hard-to-find errors: e.g. you forget to add the API name and accidentally call the module object instead of the API you really want to call inside the module. Python would happily accept this, but your application could exhibit unexpected behavior.
I also wonder how introspection would work on such modules:
I had experimented with making modules real class instances (with all the associated features, including making them callable) in Python in the early 2000s and used this to implement lazy imports. While the logic worked well and Python was indeed capable of handling module classes without changes or major problems, the idea never really took off.
I later replaced the logic with a direct implementation of lazy modules, not relying on custom importers and used that in e.g. mxDateTime.
I like this proposal, because I think it would indeed be useful for micropackages that expose a single function.
But if successive PEPs add `__getattr__`, `__call__`, `__setattr__`, and maybe more, could it make sense to introduce some sort of annotation that simply marks an individual module as “class-like”, so that everything is covered in one go?
Call me +0, I guess.
I’d be happy enough to not mess up `import datetime` vs `from datetime import datetime` again, but chances are I’d just mess up an `isinstance(x, datetime)` later on and wouldn’t be much happier. (Arguably I shouldn’t do the `isinstance`, but I seem to be serializing/deserializing often enough that it’s usually justified.)
Documentation I’m not concerned about - I assume API designers care about their users and will write the documentation they need, so `help(module)` still just returns `module.__doc__`.
However, we probably need a clarification on style, specifically, capitalisation, but also verbiage. Mainly because the style-enforcers I’m concerned about are the ones who blindly follow rules, and if we don’t give them rules then they’ll invent them and try to impose them anyway. Module naming tends to follow different rules from functions and classes, which means callers are likely to know they’re calling a module, when really they should only be calling a callable without being concerned as to its type.
For example, if we did this to a `thread` module,[1] would `thread(...)` be calling `start_new_thread()` or instantiating `Thread`, and why doesn’t the name `thread` give me any hints? Is this just a case where I shouldn’t consider making the module callable? Or should I consider renaming it? Or is it going to be considered Good Design going forward for all-lowercase nouns to start something?
(Note that I’m not suggesting PEP 8 updates, except as necessary for stdlib implementers. I’m suggesting a section in here that provides some guidance on how to name callable modules, which I hope will look like “name them like regular modules unless you need to name them differently”.)
Technically, I’m really not concerned. This should be straightforward enough, and introspection looking for “callable” before “module” will see it as callable and will need to adapt.
There are tricks you can do to make `import foo` “return” an instance of your own class rather than a true module (with only the default importers running), but it’s a bit rough. I believe the module `__getattr__` came out of discussions to make it easier, as it was the only case found to be important enough to justify that level of metaprogramming.
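For reference, that module-level `__getattr__` (PEP 562) lets a plain module customize failed attribute lookups without any `__class__` tricks. A minimal sketch, using a dynamically created module so it runs standalone; the `demo`/`answer` names are invented:

```python
import types

# Build a throwaway module object; in real code this would be a .py file.
mod = types.ModuleType("demo")

def _module_getattr(name):
    # PEP 562: consulted only when normal lookup in the module's __dict__ fails.
    if name == "answer":
        return 42
    raise AttributeError(f"module 'demo' has no attribute {name!r}")

mod.__getattr__ = _module_getattr  # stored in the module's __dict__

print(mod.answer)      # 42, served by __getattr__
print(mod.__name__)    # 'demo' - normal lookup still takes priority
```

In a real module file you would simply define `def __getattr__(name): ...` at top level.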
Doesn’t really exist, but blend `threading` and `_thread` in your mind. ↩︎
Slightly OT: Do we want to encourage micropackages though? (by @mitsuhiko)
A couple of really simple examples of this are `decimal` and `fractions`. The module is called `decimal`, but the type (class) is called `Decimal`. And that’s (currently) matching the style guides that say modules should be lowercase and classes capitalised. What’s the “correct” style for a callable `decimal` module? And `fractions` is even worse, as there’s the question of singular vs plural.
As I said above, I’m a strong -1 on using this feature in any existing stdlib module, but it’s useful to think about stdlib cases to understand the design concerns the proposal brings up.
```python
from pprint import pprint
from types import ModuleType
import sys

class _CallableModule(ModuleType):
    def __call__(self, *args, **kwargs):
        return pprint(*args, **kwargs)

sys.modules[__name__].__class__ = _CallableModule
```
This can already be done to a certain degree.
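A self-contained variant of the same trick, runnable without a separate file, registers a synthetic module object directly in `sys.modules` (the `demo` module name is invented for the example):

```python
import sys
import types
from pprint import pprint

class _CallableModule(types.ModuleType):
    def __call__(self, *args, **kwargs):
        return pprint(*args, **kwargs)

mod = _CallableModule("demo")
mod.greeting = "hello"       # modules still hold attributes as usual
sys.modules["demo"] = mod

import demo                  # import binds whatever sys.modules holds
demo({"works": True})        # prints {'works': True}
print(demo.greeting)         # attribute access is unaffected
```

This shows the key property under discussion: the object is still a module (attributes, `__name__`, etc.) while also being callable.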
Yes, aside from the occasional minor convenience, mostly this will just mess with people’s mental model of Python. Currently, there is a huge and easy-to-explain difference between `import pprint` and `from pprint import pprint`.
This PEP will put code reviewers in the awkward position of having to remember which modules have the call capability and in which version of Python that capability was added. For example, when is this code correct: `import pprint; pprint(dir())`?
Also, the premise that modules have only one principal capability is dubious. A module may start that way but can grow over time.
There is also the matter of spelling. We typically capitalize class names while lower-casing function names. This is a problem for cases like the `graphlib` module that only features `TopologicalSorter`. We really don’t want instantiation with `ts = graphlib(*args)`. That would appear too much like a function call.
No, this is unrelated. “Treating as a script” means running code on import (i.e. `import foo` automatically executes top-level code from `foo`). “Making a module callable” means running code when the imported module is called (i.e. `import foo` merely imports the module; you still have to run `foo()` to actually call it).
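The boilerplate the earlier question refers to is the usual main-guard pattern, which stays exactly the same regardless of this proposal; a minimal sketch (file and function names invented):

```python
# script_and_module.py - usable both as a script and as an importable module
def main():
    # The reusable entry point that other modules can import and call.
    print("doing the work")

if __name__ == "__main__":   # true only when this file is run as a script
    main()
```

Importing this file gives access to `main()` without running it; executing it as a script runs `main()` once. A callable module would not remove the need for this guard.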
I think saying that callable modules are going to confuse developers and/or IDEs is disingenuous. No one who has spent any time with the language is confused that a class may contain attributes (even as part of its API) in addition to being callable, or that instances of that class also contain instance attributes, retain access to class attributes, and can themselves be callable again.
So far the attention has been on callable top level modules, which I personally believe could be useful, but callable submodules will also be possible with this and I think be an extra tool in any developer’s refactoring tool belt, especially as a single module package begins to outgrow its first file.
Thinking in terms of submodules also shows how using a callable module isn’t really any different from any other API: you have to actually read the documentation. If they had always been part of the language yet you had never before used `datetime`, what would you think `datetime.datetime` was? A submodule, a class, or maybe a factory function? The real answer is probably the one you would guess last, based on convention. Yet it doesn’t really matter: the docs said to call it and pass these arguments, and that’s about all you care about as a consumer.
It is usually incorrect to suggest that someone is being “disingenuous” in discussions like these even if it does seem that way to you at the time that you say it. It is better to take what others are saying in good faith and try to understand their perspective.
I already see a lot of confusion, particularly for beginners, around the distinction between a module and an object imported from a module. These two things are sometimes interchangeable but not always. For example in

```python
import a.b.c
```

each of `a`, `b` and `c` needs to be a package/module. However in

```python
from a.b import c
```

it is ambiguous whether `c` is a module or an attribute that is defined in the `b` module. It is easy to get mixed up about what sort of object you have here, especially if the `c` module contains an object whose name is also `c`. Personally I think it is unfortunate that this ambiguity in the import statement was allowed, especially since there is a clearer alternative for the case where it is a module:

```python
import a.b.c as c
```
To consider a concrete hypothetical example that has already been mentioned, currently you should choose between these two:

```python
# approach 1
import datetime
t = datetime.datetime(2000, 1, 1)

# approach 2
from datetime import datetime
t = datetime(2000, 1, 1)
```

You need to know whether you are importing the module or importing the name from the module, and mixing them up mostly gives a clear error:

```python
import datetime
t = datetime(2000, 1, 1)  # TypeError: 'module' object is not callable
```
Following the suggested approach here you could make the `datetime` module callable so that this works. It would still be necessary to choose between `import datetime` and `from datetime import datetime` though, because the other attributes of the module/class would only be available in one case rather than the other. More confusingly, they do have some attributes in common (`date`, `time`, `tzinfo`), so it could be easy to mess things up.
This suggestion also presumes that it is clear what callable someone would typically want when using `datetime`: here I presumed the `datetime` constructor, but in practice I am more likely to use other methods like `datetime.now` or `datetime.fromtimestamp`. For those I would still need to do `from datetime import datetime`, or otherwise we are faced with an awkward choice for exactly what callable should be the “default” function of the module (in the face of ambiguity…)
Altogether I think that the suggestion of making modules callable with `__call__` offers very little benefit and has the potential to introduce confusion by mixing up what should be a clear distinction between modules and the objects found in modules. I see no particularly compelling use case, because anything that can be done with this can also be done just by importing a function and calling it, which is less prone to confusion. There might be very esoteric cases where it is particularly useful to do this, but for those maybe patching `sys.modules` is good enough.
In a parallel thread this proposal is being conflated with a suggestion to be able to define `__setattr__` for modules. I think conflating these two proposals is unfortunate, because that other proposal is precisely about preventing users from getting confused and accidentally setting attributes on a module when they should be importing an object from the module and setting attributes on that. The intention is to help users to see when they have mixed up their imports. This proposal, on the other hand, is precisely aimed at blurring the distinction between modules and attributes of modules, which I don’t think is particularly helpful.
Yes, I should have phrased that better. Existing dev tools like `help()` or autocomplete or IDE tooltips shouldn’t have any harder a job with callable modules than they already do working with any current object that is callable yet also provides attributes, methods, or even indexing at the same time.
it is ambiguous whether `c` is a module or an attribute that is defined in the `b` module. It is easy to get mixed up about what sort of object you have here, especially if the `c` module contains an object whose name is also `c`. Personally I think it is unfortunate that this ambiguity in the import statement was allowed, especially since there is a clearer alternative for the case where it is a module:
This is a good point and is probably why I am personally fine with callable modules, because this ambiguity already exists.
The `datetime` module is probably a poor example (one that I perpetuated), as there is not only one thing it does. `pprint` is stronger IMO in that, while it provides a half dozen functions and a class, there is one that is wanted nearly every time. It’s not about packages that already provide a callable under the same name, but about packages with one obvious primary task.
This can already be done to a certain degree.
This was going to be my comment – the `__class__` assignment trick is definitely more obscure, but OTOH it works today, plus it allows users to use any dunder or descriptor on modules, not just `__call__`. This isn’t necessarily a blocker – we still added the special `__getattr__` and `__setattr__` support despite those already being possible through `__class__` assignment. But it changes the question – not “are callable modules worth supporting at all?”, but “do the benefits of making them easier to discover/use justify adding a second way to do things?”. (And it would also be nice if we came out of the discussion with more general principles about which module dunders are worth special-casing and which aren’t.)
This isn’t necessarily a blocker – we still added the special `__getattr__` and `__setattr__` support despite those already being possible through `__class__` assignment.
Is there `__setattr__` support?
There is a parallel thread proposing to add that:
> Since CPython 3.5 it’s possible to customize setting module attributes by setting the `__class__` attribute. Unfortunately, this comes with a measurable speed regression for attribute access:
>
> ```
> $ cat b.py
> x = 1
> $ python -m timeit -r11 -s 'import b' 'b.x'
> 5000000 loops, best of 11: 48.8 nsec per loop
> $ cat c.py
> import sys, types
> x = 1
> class _Foo(types.ModuleType): pass
> sys.modules[__name__].__class__ = _Foo
> $ python -m timeit -r11 -s 'import c' 'c.x'
> 2000000 loops, best of 11: 131 nsec per loop
> ```
>
> For r…
As mentioned there, one downside of setting `__class__` is that it slows down all attribute access, and reading attributes from a module is very common (`np.sin(...)` etc).
it would also be nice if we came out of the discussion with more general principles about which module dunders are worth special-casing and which aren’t.
In ordinary Python code modules are namespaces and their interface is only expected to provide attributes. The `__getattr__` proposal (PEP 562) has a clear motivation around accessing deprecated attributes. PEP 562 also added `__dir__`, which is for listing attributes. The `__setattr__` proposal in the other thread is motivated by wanting to disallow (or warn about) setting attributes in cases where it could be a likely user error to do so. I don’t see why any special support should be added to encourage defining modules that have unusual features besides attribute access: most other operations should generally be expected to give `TypeError`.
Is there `__setattr__` support?
No, I misremembered
As mentioned there, one downside of setting `__class__` is that it slows down all attribute access, and reading attributes from a module is very common (`np.sin(...)` etc).
Yeah, that is unfortunate. Maybe we can fix that, though? In principle there’s no reason why a no-op subclass has to have slower attribute access. (IIRC it’s because we currently have a special-case fast path for `ModuleType`, and otherwise we go through the full regular lookup chain. So we’d want some extra cleverness to apply that fast path to `ModuleType` subclasses that don’t mess with attribute lookup, e.g.) And the nice thing is that a pure optimization is much simpler to land than a new public/supported API.