Currently some modules use test.support.check__all__ to validate __all__ matches exactly the public names in a module-specific test. There is also a test that runs across all modules, Lib/test/test___all__.py, and uses exec to do a from {modname} import * to validate for every module that every __all__ entry is importable.
This leaves a gap for modules with __all__ without a check__all__ test where new public named items are added but extending __all__ is missed. It also looks like occasionally a public name is used instead of a private name.
Proposal
I’d like to extend test.test___all__ to catch these by having it run check__all__ on each module with a __all__ it imports.
This would mean when adding a new name to a file one of the following:
If there is no __all__, works just like today
If it’s a private name, works just like today
If it’s a public name either:
a. Add to __all__
b. Add to explicit not_exported
Which hopefully closes the gap. Doing that raises a couple questions though:
Many modules (149) with __all__ currently fail this test, is adding it worthwhile?
How to work towards the right resolution of cases across all modules?
For many modules the number of missing items is small (1-5).
Should private modules (name begins with _) with an __all__ be excluded? (10 modules)
There’s explicit not_exported which needs to be passed in for multiple modules (ex. template in tempfile), should it live in test___all__, the specific module under test, somewhere else?
If / Once implemented could the per-module check__all__ tests be removed?
You can inspect a specific module with the following snippet:
import importlib
from unittest import TestCase
from test.support import check__all__
def check_module(modname, *, not_exported=()):
not_exported = not_exported or []
testcase = TestCase()
testcase.maxDiff = None
# Make explicitly listed privates members pass check.
names = {}
exec(f"from {modname} import *", names)
names.pop('__builtins__', None)
names.pop('__annotations__', None)
names.pop('__warningregistry__', None)
if '__all__' not in dir(sys.modules[modname]):
print(f"{modname=} does not use __all__")
return
with testcase.subTest(module=modname):
check__all__(testcase, sys.modules[modname], extra=names,
not_exported=not_exported)
print(f"{modname=} __all__ looks good!")
# Fails, PathLike and process_cpu_count should likely be exported.
# Others added to not_exported
check_module('os')
# Explicit non-exported public-looking member.
check_module('tempfile', not_exported=['template'])
Centralising the test logic with a registry of “expected semi public names” seems reasonable to me.
For starting points:
+1 for recording the discrepancies to prevent new occurrences without having to decide on everything immediately
+1 for skipping private modules by default
I think the registry of discrepancies can live in the central test initially. If it were to be moved out, the most useful location would probably be a TOML data file (comments!) in the test suite rather than elsewhere in the source code
+0 for removing the per module tests (if a module has a PyPI backport, the centralised checks are easier to forget, but backports aren’t typically going to be adding new public names)
I see a lot of overlap in this proposal to find which public names are missing from __all__, with @ryan-duve’s proposal to find which public functions are missing from the docs.
This tries not to define “public” or “private” itself. check__all__ uses as a metric if the name starts with _ but the explicit list of not_exported is supported so that libraries can make their decisions explicit. It also supports explicitly listed items which begin with _. Definitely open to encoding that choice in a format multiple tools can use to simplify development; that was part of my motivation asking where the list(s) should live . My goal here is when a name is added an intentional choice is made whether it’s in __all__ or not.
On the other hand, PEP 387 explicitly says that “if something is not documented at all, it is not automatically considered private.”
IMO, we should document the questionable API – if only to (soft-)deprecate it and tell users how to replace it. And deprecated API should not be added to __all__.
I think you’re interpreting a bit too much into the __all__ module attribute. It’s main purpose is to limit which symbols get imported when using from mod import *, but that import usage in itself is already poor style in many cases.
There is no implication of an API being defined as public via the __all__ attribute. Whether an API is public mainly depends on whether it’s documented and in some cases, also whether it’s in popular use (even if not documented, i.e. the missing documentation is a bug and not a feature ).
That said, having at least most of the public APIs listed in __all__ certainly is desirable.
Would it be okay to add a requirement to explicitly record that design decision?
A number of things that are in popular use and well documented, like os.PathLike, are not included in __all__. This proposal doesn’t mean it has to be added to __all__; there’s three options (Add to __all__, use a _ prefixed name, or add to explicit not_exported list). If not exporting is intentional, the module can just add to its not_exported list (format TBD).
One of the biggest things I like about using @public in my own code[1] is that it’s a clear indication of what names are public[2]. Plus, it eliminates spooky-action-at-a-distance by putting the decorator right next to the definition of the public thing. Keeping __all__ filled is an added bonus, but not the most important thing, IMHO.
I don’t believe this statement is true. The more we put in __all__ the more it encourages a code style anti-pattern of using a from beans.spam import * style import.
In general I’m in favor of not adding new things to __all__ in a module unless that module is explicitly intended to be used with a * import.
Eh… I’d say the more we put in a module the more it becomes an anti-pattern. Maintaining __all__ isn’t the issue.
That said, I agree it should be reserved for sensible modules to import * from (e.g. I most often do it with urllib.request, and would never do it for asyncio or typing).