Add `umask` to `contextlib`

Thanks! And yes, please go ahead and open a PR for that branch.

Yes, a new thread for that would be good, as we need to decide between the different options for implementing the aliasing.

One of the hidden secrets of the contextlib module is that a protocol along these lines kind of exists in order to make contextmanager CMs usable as function decorators: function decorated with a context manager can only be invoked once · Issue #55856 · python/cpython · GitHub

The implicit CM refresh happens when called rather than through __aenter__, though.

Trying to generalise the way the chdir CM works to arbitrary CMs would likely be feasible, but I don’t think the problem comes up often enough to be worth worrying about since:

  • natively implemented CMs will always be faster than generator based ones, since they don’t have to spend time building and invoking the generator (Edit: for simple cases. For sufficiently complex state, getting to use optimised local variables instead of instance variables as storage can make the trade-offs more complicated. That performance balance is more likely to tip for iterators than it is for CMs, though)
  • once you have switched to writing a native CM, pushing/popping values in a list is just as simple to implement as setting/clearing an instance attribute
  • there are some CMs where trying to make them reentrant (or even reusable) is a genuinely incoherent request (mostly external resources where once they’re closed they’re released and not accessible anymore), so a general tool that isn’t actually fully general is likely to create more confusion than it resolves
2 Likes

Okay the PR is now open here!

I’ll open another thread soon for ‘Move contextlib.chdir to become shutil.chdir

2 Likes

Thread discussing chdir being moved from contextlib to shutil is here

1 Like

What citation do you need? You can cite me if you wish.

The wast mayority of user written context managers are implemented via generator functions. They are not reusable by their nature. Even these that are implemented as classes are usually not reusable. Even its neibours in contextlib, redirect_stdout and redirect_stderr are not reusable. You need to find a lot of reusable context managers and show that they are majority to challenge my statement.

Don’t use inheritance, use composition. For example, the following wrapper

class reusable:
    def __init__(self, func, /, *args, **kwargs
        self._func = func
        self._args = args
        self._kwargs = kwargs
        self._stack = []
    def __enter__(self):
        cm = self._func(*self._args, **self._kwargs)
        ret = cm.__enter__()
        self._stack.append(cm)
        return ret
    def __exit__(self, *exc_info):
        cm = self._stack.pop()
        return cm.__exit__(*exc_info)

converts non-reusable context manager to reusable. You can apply it to generator-based chdir(), umask(), whatever.

My point is that the problem comes up not often enough even for chdir(). There is a similar context manager in test.support, it is not reusable, and nobody asked to make it reusable. There are many other context managers in test.support and the stdlib which are not reusable. The generator-based implementation is simply the simplest (one function vs a class with at least three methods). If anybody needs a resusable CM in case when the CM can be made reusable, it is better to provide a simple wrapper (see above) than rewrite any existing CM.

1 Like

It depends on the environment. Since I’m the sysadmin for ours, I know the Linux user’s primary group is more restrictive than 4 (adm). In any event, _adm_open could set the mode to 0o600 on open and then change it later.

These CMs aren’t particularly complicated no matter which way you write them:

class chdir:
    """Non thread-safe context manager to change the current working directory."""

    def __init__(self, path):
        self.path = path
        self._old_cwd = []

    def __enter__(self):
        self._old_cwd.append(os.getcwd())
        os.chdir(self.path)

    def __exit__(self, *excinfo):
        os.chdir(self._old_cwd.pop())
class chdir:
    """Non thread-safe context manager to change the current working directory."""

    def __init__(self, path):
        self.path = path
        self._old_cwd = None

    def __enter__(self):
        if self._old_cwd is not None:
            raise RuntimeError("chdir context manager is not re-entrant")
        self._old_cwd = os.getcwd()
        os.chdir(self.path)

    def __exit__(self, *excinfo):
        os.chdir(self._old_cwd)
        self._old_cwd = None
@contextmanager
def chdir(path):
    """Non thread-safe context manager to change the current working directory."""
    old_cwd = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_cwd)

However, the last version is unquestionably the slowest of the three, and the state to be managed is simple enough that switching to one of the class based versions is straightforward.

And once you are using one of the class based approaches, it’s easier to just make the CM re-entrant than it is to prevent re-entry attempts from causing weird problems.

The expectations for the one in test.support are different, since test.support code only needs to meet the bar of “easy enough for a core developer to use when writing tests”, rather than the higher bar of “hard for even a complete novice to misuse” (which is definitely something that contextlib aspires to, even if we haven’t always succeeded).

I assume the actual history for chdir is that it just copied the redirect_stdout implementation, which I made re-entrant way back in Close #19403: make contextlib.redirect_stdout reentrant · python/cpython@8e113b4 · GitHub based on Issue 19403: Make contextlib.redirect_stdout reentrant - Python tracker. We never even considered the generator based approach for that one - Raymond added the initial class based version in Issue #15805: Add contextlib.redirect_stdout() · python/cpython@088cbf2 · GitHub based on the discussion in Issue 15805: Add stdout redirection tool to contextlib - Python tracker.

Interestingly, that bpo-15805 considered more potential locations than I remembered. I’ll add a note about that to the other thread regarding potentially moving the existing process state manipulation APIs.

Edit: something to keep in mind is that while @contextmanager is simple to use, the resulting context managers are far from simple for the interpreter to execute at runtime. Counting comments and blank lines, _GeneratorContextManager and _GeneratorContextManagerBase are around 100 lines of Python code. While that does include copious comments, the nature of the code is such that it needs those comments (one of the comments also reminded me that Simplify per-instance control of help() output · Issue #63603 · python/cpython · GitHub messes up the interactive help output for generator CM instances).

2 Likes

Apart from documentation tweaks, the main change on the GitHub PR is that we’re now considering a different name than just the bare umask, lest it collide with os.umask, and to stand out as a context manager.

I propose the idiomatic name of umask_of

with umask_of(0o077):
    do_private_stuff()

Looks Pythonic.

One other suggestion was temp_umask, however it sounded too similar to the folder /tmp. In particular if chdir is to follow in a similar way then temp_chdir sounds like it would chdir into /tmp. Also it has been pointed out that temp_umask sounds like a temporary local variable.

1 Like

@jb2170 The dangling preposition in umask_of bothers me. I don’t hate it (since it does read OK once the parameter is present in an actual invocation), but I don’t love it either.

How would you feel about using ensure_* as a prefix convention instead?

with ensure_umask(0o077):
    do_private_stuff()

“Ensure” is a nice verb for this kind of use case, since it allows for the CM to make itself a no-op if there’s an efficient way for it to detect that the requested state is already the current state.

It does mean that the contextlib.chdir replacement would be shutil.ensure_cwd, but I’m against programmatic deprecation of contextlib.chdir in the first place (I’ll post more about that in the other thread).

2 Likes

For me, the _of suffix just doesn’t communicate anything. It succeeds in the goal of making the name different to umask [1] but for all it tells you about why it’s different, you might as well pick something like umask2.


  1. not a goal I particularly agree with but hey ho ↩︎

5 Likes

How about synonyms of “temp”, e.g. fleetingly_umask() or momentarily_umask()?
Or use the Linux shell convention (and also Windows nowadays) for temporary cwd, pushd and popd, so push_pop_umask()?

Agreed, umask_of doesn’t carry additional information.

For that reason, why not call it umask_context?

from shutil import umask_context 

... # many lines of code here 

with umask_context(0o007):
    save_secret()

By that same reasoning, I think ensure_umask is okay, but not great. If it doesn’t end up using a different mechanism from os.umask, I’m not sure that ensure is all that meaningful.


Given that it’s in a different module from os, I’m not sure I’m really convinced that it needs a different name. But if it is being renamed, I’d like to help give it the best name possible.

5 Likes

IMHO for consistency it’s better to not change it, since also the chdir context manager have the same problem.

1 Like

umask_context sounds good! I’d be in favour of that. I think we’ve I’ve overthought it, and that’s the simplest candidate name lol. It does what it says on the tin.

ensure_umask sounds too much like an assertion function: def ensure_umask(mask): assert get_current_umask() == mask so I’m not sure that’s the best.

And yeah I agree umask_of only makes sense within its with umask_of(mask) context. Just reading it in the __all__ of shutil doesn’t immediately explain it.

I’ll change to umask_context.

4 Likes

PR updated

I’m happy enough to go ahead with the *_context suffix convention. While I prefer names to capture intention rather than describe aspects of the implementation, I also agree that ensure_* has some ambiguity between asserting something is already true and actually temporarily making it true.

If we come up with a naming convention we like better before May (beta API freeze) we can still change it.

That said, I’d love to come up with an actual name for these kinds of apply-and-revert context managers, especially if we can make the reusable-and-reentrant pattern for implementing them easier to use.

Framing the naming problem that way inspired the following notion of “modification contexts” (which in turn leads to proposing modify_{NOUN} as the general naming convention when there’s no more specific verb that applies, like replace_{NOUN} or redirect_{NOUN}):

class ModificationContext(ContextDecorator, AbstractContextManager):
    """Context manager to change a value on entry and revert it on exit."""
    # Since these are inherently reusable-and-reentrant context managers,
    # they're compatible with being used as function decorators
    # in addition to being usable as regular context managers.

    def __init__(self, value):
        self._applied_value = value
        self._values = []

    @property
    def applied_value(self):
        return self._applied_value

    def __repr__(self):
        return f"{type(self).__qualname__}({self._applied_value!r})"

    def apply(self):
        self._values.append(self._apply())
        return self._applied_value

    def revert(self):
        self._revert(self._values.pop())

    @abstractmethod
    def _apply(self):
        # Actual implementation of applying the change and
        # reporting the value to be restored
        raise NotImplementedError

    @abstractmethod
    def _revert(self, previous_value):
        # Separate reversion method is given the value to be restored
        raise NotImplementedError

    def __enter__(self):
        return self.apply()

    def __exit__(self, *exc_info):
        self.revert()

With the following subclass in contextlib (no changes needed to the public redirect_* subclasses):

class _RedirectStream(ModificationContext):

    _stream = None

    def _apply(self):
        previous_stream = getattr(sys, self._stream))
        setattr(sys, self._stream, self._applied_value)
        return previous_stream

    def _revert(self, previous_stream):
        setattr(sys, self._stream, previous_stream)

And these in shutil:

class modify_cwd(ModificationContext):

    @property
    def path(self):
        return self._applied_value

    def _apply(self):
        previous_cwd = os.getcwd()
        os.chdir(self._applied_value)
        return previous_cwd

    _revert = staticmethod(os.chdir)

class modify_umask(ModificationContext):
    # Convention: when the same word is used for both the noun and verb forms,
    # use it as the value attribute and use `_apply_{NOUN}` as the method name

    @property
    def umask(self):
        return self._applied_value

    def _apply(self):
        return os.umask(self._applied_value)

    _revert = staticmethod(os.umask)

We could potentially even add the following generalisation of the
stream redirection modification to contextlib:

class replace_attribute(ModificationContext):
    # Similar in concept to `unittest.mock.patch`,
    # just without all the testing related features

    def __init__(self, target, attr, value):
        self._target = target
        self._attr = attr
        super().__init__(value)

    def _apply(self):
        previous_value = getattr(self._target, self._attr)
        setattr(self._target, self._attr, self._applied_value)
        return previous_value

    def _revert(self, previous_value):
        setattr(self._target, self._attr, previous_value)

Edit: to be clear, the only part of this idea that directly affects the umask CM proposal is calling it modify_umask instead of umask_context. (and maybe making the public access to umask be via a read-only property)

4 Likes

I like the abstraction. It generalizes the problem nicely and makes all of the implementations naturally reentrant.

100% agree. I like names that tell us what things are whenever possible.

modify_* is a tiny bit generic for my taste. I still prefer *_context over that because it tells us that it’s a context manager.

In trying to think of a word which doesn’t just mean “set it”, but also inherently carries a meaning or connotation of “later unset it”, the best idea I’ve had so far is toggle_*. These would all be “toggles” and I think it lines up well because to “toggle off”, the implementation has to be a context manager.

“Toggle” came from imagining a switchboard. The other analogy I thought about are self regulating systems, but I think negatively_self_reinforcing_umask wouldn’t be popular. :stuck_out_tongue_closed_eyes:

1 Like

Revert[i|a]bleStateContext?

Toggle implies binary states to me, which isn’t what is happening here. (I know the intent is to mean “definitely this value” and “potentially some other value” as the two states, I just think that’s seriously stretching the meaning of the word).

temp_* and modify_* are both abbreviations of the too-long-to-be-usable name temporarily_modify_*, but the modify_* variant keeps the actual verb from the verb phrase, and avoids any potential connotations of being related to the tempfile module.

Edit: while I did mention it as a consideration earlier, I’m not too worried about having “and change it back” be explicit in the CM names. It is common for CMs to only describe their entry action (leaving their exit action as implicitly undoing the entry action), and reviewing the stream redirection CMs reminded me of that fact.

Based on your ModificationContext abstract class I was thinking ‘are there any more classes like umask that we might want to add?’. I mentioned seteuid and friends earlier, in particular the one that makes sense as a context manager is seteuid: seteuid sets the effective user id, with which files are created / owned, and against which filesystem access checks are done. It reminded me of sudo and su. Some people read ‘su’ as ‘switch user’, or ‘super user’, but looking at man 1 su: “su allows commands to be run with a substitute user and group ID”

  • substitute_{NOUN}

  • substitute_cwd

  • substitute_umask

  • substitute_euid (would need a separate thread to discuss)

That seems like a good verb, with a connotation of transience. It can be read as both an imperative ‘substitute the umask!’,

and also as an adjective: ‘this is a substitute umask’. I hope this double edged property of ‘substitute!’ (imp) and ‘substitute’ (adj) is useful, and doesn’t create confusion pulling in both directions.

The _context suffix is sort of implied by virtue of a class having __{enter,exit}__ methods, and removing _context from umask_context brings us right back to plain umask, which we’ve ruled out.

And using substitute means that the class chdir_context would be renamed substitute_cwd. Technically win32 has a chdir syscall / ntdll.dll-call, but Windows / DOS is weird with non-posix-y names (‘chdir is a deprecated alias for the _chdir function’). Python’s os.chdir of course works on Windows, but cwd is a more platform-agnostic, implementation independent concept.

So we’d be continuing a pattern of cwd with that. (sorry to cross-contaminate threads but it’s relevant :slight_smile: )

Oh, and SubstitutionContext may be a name for ModificationContext. All of substitute_{cwd,umask,euid} follow the stack-style implementation. The substitution behavior is fundamentally what separates file-locking context managers, who should do nothing on re-entry (if ever one wanted a re-enterable file-locking context manager which sounds like a deadlock-disaster waiting to happen), from substitution context managers, who save-and-substitute the current state every (re)entry, and restore it on exit.

1 Like

A quick (non-exhaustive) rundown of suggested names, and comments

name comments
contextlib.umask wrong place
shutil.umask bare name umask shadows os.umask, and prevents non-context manager helper functions with the bare name, eg shutil.chown
umask_of dangling _of doesn’t convey meaning outside of with umask_of(mask)
ensure_umask sounds like an assertion, not a context
temp_umask causes confusion with /tmp and tempfiles, especially temp_chdir
umask_context _context makes itself redundant, and removing it leaves a bare name of umask
fleetingly_umask long
momentarily_umask long
temporarily_modify_umask long
toggle_umask incorrectly boolean-sounding
push_pop_umask specifies implementation, but not intention
modify_umask doesn’t imply context / invertibility, and modify_cwd sounds like it could chmod the cwd or something else
substitute_cwd looks good so far!?