Testing for public globals missing from __all__

Background

Currently some modules use test.support.check__all__ to validate __all__ matches exactly the public names in a module-specific test. There is also a test that runs across all modules, Lib/test/test___all__.py, and uses exec to do a from {modname} import * to validate for every module that every __all__ entry is importable.

This leaves a gap for modules with __all__ without a check__all__ test where new public named items are added but extending __all__ is missed. It also looks like occasionally a public name is used instead of a private name.

Proposal

I’d like to extend test.test___all__ to catch these by having it run check__all__ on each module with a __all__ it imports.

This would mean when adding a new name to a file one of the following:

  1. If there is no __all__, works just like today
  2. If it’s a private name, works just like today
  3. If it’s a public name either:
    a. Add to __all__
    b. Add to explicit not_exported

Which hopefully closes the gap. Doing that raises a couple questions though:

  1. Many modules (149) with __all__ currently fail this test, is adding it worthwhile?
    1. How to work towards the right resolution of cases across all modules?
      1. Can this follow the model of gh-101100 “known sphinx ref issues” and allow no new cases + have a todo list effectively.
      2. For many modules the number of missing items is small (1-5).
    2. Should private modules (name begins with _) with an __all__ be excluded? (10 modules)
  2. There’s explicit not_exported which needs to be passed in for multiple modules (ex. template in tempfile), should it live in test___all__, the specific module under test, somewhere else?
  3. If / Once implemented could the per-module check__all__ tests be removed?

You can inspect a specific module with the following snippet:

import importlib
from unittest import TestCase
from test.support import check__all__

def check_module(modname, *, not_exported=()):
    not_exported = not_exported or []
    testcase = TestCase()
    testcase.maxDiff = None
    # Make explicitly listed privates members pass check.
    names = {}
    exec(f"from {modname} import *", names)
    names.pop('__builtins__', None)
    names.pop('__annotations__', None)
    names.pop('__warningregistry__', None)
    if '__all__' not in dir(sys.modules[modname]):
        print(f"{modname=} does not use __all__")
        return
    with testcase.subTest(module=modname):
        check__all__(testcase, sys.modules[modname], extra=names,
                    not_exported=not_exported)
    print(f"{modname=} __all__ looks good!")

# Fails, PathLike and process_cpu_count should likely be exported.
# Others added to not_exported
check_module('os')

# Explicit non-exported public-looking member.
check_module('tempfile', not_exported=['template'])

Related Proposals

edit: Added related proposals section; added background and proposal headings to improve navigability

5 Likes

Centralising the test logic with a registry of “expected semi public names” seems reasonable to me.

For starting points:

  • +1 for recording the discrepancies to prevent new occurrences without having to decide on everything immediately
  • +1 for skipping private modules by default
  • I think the registry of discrepancies can live in the central test initially. If it were to be moved out, the most useful location would probably be a TOML data file (comments!) in the test suite rather than elsewhere in the source code
  • +0 for removing the per module tests (if a module has a PyPI backport, the centralised checks are easier to forget, but backports aren’t typically going to be adding new public names)
5 Likes

Hey @thomas, is it time to revisit adding @public to built-ins? :winking_face_with_tongue:

3 Likes

How do you define public and private?

I see a lot of overlap in this proposal to find which public names are missing from __all__, with @ryan-duve’s proposal to find which public functions are missing from the docs.

@guido wrote there:

And I think the same essentially applies to __all__.

If we had some sort of list of “public-looking names but not actually public”, that could be used by both proposals.

2 Likes

This tries not to define “public” or “private” itself. check__all__ uses as a metric if the name starts with _ but the explicit list of not_exported is supported so that libraries can make their decisions explicit. It also supports explicitly listed items which begin with _. Definitely open to encoding that choice in a format multiple tools can use to simplify development; that was part of my motivation asking where the list(s) should live :slight_smile:. My goal here is when a name is added an intentional choice is made whether it’s in __all__ or not.

That goal came out of triaging In `locale` module, add missing functions to `__all__` · Issue #140924 · python/cpython · GitHub where both functions are documented. Looking at that I noticed similar issues over time (ex gh-121300, gh-111356, …). That made me wonder if a more systematic resolution specifically for __all__ was possible.

1 Like

On the other hand, PEP 387 explicitly says that “if something is not documented at all, it is not automatically considered private.”

IMO, we should document the questionable API – if only to (soft-)deprecate it and tell users how to replace it. And deprecated API should not be added to __all__.

4 Likes

Aye, one of the benefits I see to this is that it should help to identify those existing gaps where decisions need to be made.

4 Likes

Perhaps some proposal like this would help to implement that too?

I feel like these posts should at least link to each other, the topics are quite similar imo.

1 Like

Added a “Related Proposals” section

1 Like

I think you’re interpreting a bit too much into the __all__ module attribute. It’s main purpose is to limit which symbols get imported when using from mod import *, but that import usage in itself is already poor style in many cases.

There is no implication of an API being defined as public via the __all__ attribute. Whether an API is public mainly depends on whether it’s documented and in some cases, also whether it’s in popular use (even if not documented, i.e. the missing documentation is a bug and not a feature :smile:).

That said, having at least most of the public APIs listed in __all__ certainly is desirable.

5 Likes

Would it be okay to add a requirement to explicitly record that design decision?

A number of things that are in popular use and well documented, like os.PathLike, are not included in __all__. This proposal doesn’t mean it has to be added to __all__; there’s three options (Add to __all__, use a _ prefixed name, or add to explicit not_exported list). If not exporting is intentional, the module can just add to its not_exported list (format TBD).

One of the biggest things I like about using @public in my own code[1] is that it’s a clear indication of what names are public[2]. Plus, it eliminates spooky-action-at-a-distance by putting the decorator right next to the definition of the public thing. Keeping __all__ filled is an added bonus, but not the most important thing, IMHO.


  1. y’all can manage your own __all__'s ↩︎

  2. and there is an @private ↩︎

2 Likes

I don’t believe this statement is true. The more we put in __all__ the more it encourages a code style anti-pattern of using a from beans.spam import * style import.

In general I’m in favor of not adding new things to __all__ in a module unless that module is explicitly intended to be used with a * import.

2 Likes

I’m not worried, since most linters complain about that pattern in code. from-import-* is pretty handy in the interactive interpreter.

2 Likes

Eh… I’d say the more we put in a module the more it becomes an anti-pattern. Maintaining __all__ isn’t the issue.

That said, I agree it should be reserved for sensible modules to import * from (e.g. I most often do it with urllib.request, and would never do it for asyncio or typing).

5 Likes