Sorry to spam 3 replies in a row but here’s what the substitute_umask etc would look like if they inherited from SubstitutionContext(Manager?). I changed a couple of methods around, and made apply and revertpass-able instead of abstract. Let me know how it looks. I think this is really good.
# contextlib.py
class SubstitutionContext(AbstractContextManager):
"""Context manager to change a value on entry and revert it on exit. Designed to be reentry safe"""
def __init__(self, value):
self._applied_value = value
self._values = []
@property
def applied_value(self):
return self._applied_value
def __repr__(self):
return f"{type(self).__qualname__}({self._applied_value!r})"
def __enter__(self):
self._values.append(self.apply())
return self._applied_value
def __exit__(self, *exc_info):
self.revert(self._values.pop())
def apply(self):
# Actual implementation of applying the change and
# reporting the value to be restored
# raise NotImplementedError
pass
def revert(self, previous_value):
# Separate reversion method is given the value to be restored
# raise NotImplementedError
pass
# shutil.py
class substitute_cwd(SubstitutionContext):
"""Non thread-safe context manager to change the current working directory."""
def apply(self):
old_cwd = os.getcwd()
os.chdir(self.applied_value)
return old_cwd
def revert(self, previous_value):
os.chdir(previous_value)
class substitute_umask(SubstitutionContext):
"""Non thread-safe context manager to change the process's umask."""
def apply(self):
return os.umask(self.applied_value)
def revert(self, previous_value):
os.umask(previous_value)
class substitute_euid(SubstitutionContext):
"""Non thread-safe context manager to change the process's effective user id."""
def apply(self):
old_euid = os.geteuid()
os.seteuid(self.applied_value)
return old_euid
def revert(self, previous_value):
os.seteuid(previous_value)
In my view it is the context of the with statement that carries this information, so there is no need for any explicit affix on the name. with umask(blah) means that the block will be executed with that umask, and the dedent at the end of the block means that’s where the with-ness ends.
If we need a different name in this particular case because of a name conflict, okay, but I don’t think we need to come up with some general naming convention to signal contexts. Contexts are already adequately signaled by the with statement itself.
Without an extra prefix or suffix, __all__ becomes a game of guess-who: ["move", "chown", "umask", "chdir"]: which are functions, and which are context managers?
Also should AbstractSubstituteContextManager be put directly into contextlib.py, or kept in shutil.py until there’s clear use beyond shutil?
You read the documentation to find out. Just like you do even if you know that it’s a function, because you need to know what arguments it takes and what those arguments mean and so on. You can’t use anything without at least glancing at the documentation to see how to use it.
Too bad there isn’t a way to detect if something is being called via a with context. Then in theory the function could share the same name as the context manager, which would be nice for both umask and chdir.
… then again maybe it can be faked via knowing that an __enter__/__exit__ would be called.
Supporting
with os.umask(...):
..
Would be a sweet sugar.
Edit: then again we do this with things like os.open… why not allow the other ones to be contextmanagers too?
We’re not coming up with a naming convention for context managers in general, we’re looking for one for context managers that apply a state change when entered and revert it when exited.
The specific motivation for that search is to allow shutil to distinguish between simple wrappers (like shutil.chown) and modification contexts (like the proposed umask CM) without having that convention be entirely specific to shutil.
If not for that ambiguity problem, shutil.umask would indeed be a reasonable name (and is the one we started with).
As far as ModificationContext goes, I consider the broad nature of “Modification” as a term to be a feature rather than a problem. Similar to “context manager” itself, it isn’t making any claims about what kind of modification is being made, just that some change is being applied and then reverted. The exact choice of verb is then up to the concrete subclasses (such as redirect_stdout, or the hypothetical replace_attribute).
For the process state manipulation CMs, I personally like modify_umask and modify_cwd since the POSIX APIs tend to use words like “change” (as in chdir) and set to describe these operations, rather than words like “replace” or “substitute”.
I’m saying I don’t think we need that either. A lot of context managers do that, and broadly construed almost all of them do (e.g., open changes a context from “this file is not open” to “this file is open”). I think that with umask(...) makes it clear enough that the umask change is scoped to the with block. Now, true, it doesn’t make it 100% clear, and the user should read the documentation to understand, but it’s clear enough.
It’s even possible to have one function do double duty (as @csm10495 noted). umask can be a function that does the change and returns a context manager whose __exit__ undoes it; if it’s not used as a context manager, then the undoing doesn’t happen.
Having gone through the exercise before of coming up with a naming convention for these temporary-value change context managers, I settled on *_being. It connotes state, reads fluently in with blocks (with x_being(y); with umask_being(0o022)), isn’t too long, it’s quite distinct and has a low chance of interfering with any existing conventions.
I don’t like modify_* since the word would be a very common verb prefix for regular functions out in the wild. With using_*, I’m a bit worried that it would use up the word using convention on something as specific as context managers.
Yeah using isn’t a Python keyword but it scratches my head too as though it were one.
with using_foo(): seems like repetition / hendiadys. I’m settled on substitute_{NOUN}, unless there are any objections. And my game-plan has restructured:
PR-1: Add AbstractSubstitutionContextManager to contextlib.py and refactor _RedirectStream and chdir to inherit from it. No compatibility is broken. Its stack structure is useful to all the following PRs.
PR-2: Add shutil.substitute_umask. Use case is established. I’m going to temporarily close the GitHub PR for this one as it will be git rebased post-merge of PR-1.
PR-3: Add shutil.substitute_euid and shutil.substitute_egid. Their use is for scripts that run as root and switch {user|group} id temporarily. This needs a new thread to debate their admission, but most of the heavy debating has been done here.
PR-4: Move contextlib.chdir to shutil.substitute_cwd. The controvertial one, but a deprecation timeline / support has been established. At this point substitute_cwd, substitute_umask, substitute_euid, and substitute_egid all sit next to each other in shutil
Procedurally, I think the shutilumask CM addition (however it is spelled) should happen independent of any contextlib pattern naming and extraction from stdlib examples. I just wanted to consider the latter up front, since it had the potential to inform the shutil naming convention (although we may decide we don’t want to do that, this is just about considering it).
I really don’t like substitute/substitution, as it’s too narrow a word for the concept it’s trying to model (general changes/modifications that need a restoration value tracked), and it’s a near exact synonym for a far more common word pair that could be used instead (replace/replacement). (The same situation exists to some degree with modify/modification vs change, but the latter has many potential meanings that modify doesn’t, including being its own noun form)
Another suffix variant along the same lines as *_of(...) and *_being(...) would be with cwd_set_to(...): and with umask_set_to(...):.
The simplest prefix variant would just be changed: with changed_cwd(...): and with changed_umask(...):. Past tense is because when the CM body is running, the change has already happened.
The mapping from chdir to changed_cwd is pretty straightforward (ch is expanded to changed, and the generic dir is replaced with the more specific cwd).
The downside of pursuing that “acquire the resource in __init__” model is that it means you can’t reasonably make the resulting CM anything other than a one-shot (as with opened files).
Since we’d like to make the new umask CM align behaviourally with contextlib.chdir, it pushes down the path of applying the change in __enter__ rather than in __init__.
I’m -1 for adding umask to contextlib. IMO, such things are better off as a third party library. I’m afraid this will create a precedent for all sorts of context manager bloat in the standard library.
I’m +0 to adding a general “modification context” context manager, as Alyssa suggests, though.
If we do add a “modification context” CM, it would be reasonable to include a umask as an example recipe in the docs, along with an “attribute setter”.
But I don’t think umask is important enough to warrant being in the stdlib in its own right. And I’m even less convinced that the recently suggested euid should be. Apart from them being too niche to warrant being included, they would most likely be used in security-sensitive areas, and do we want to deal with the inevitable issues where people fail to use them correctly (for example, expecting them to help on Windows)?
While I agree that contextlib.chdir() isn’t the best place for this functionality [1], I also am not sure shutil is the best new home. That module’s description starts out with:
The shutil module offers a number of high-level operations on files and collections of files. In particular, functions are provided which support file copying and removal. For operations on individual files, see also the os module.
Trying to beginners mind this bikeshed, I’d likely look in os first, but then there are problems with name collisions and semantic compatibility there as well. We’re probably better off with an entirely new module for these “os utilities”, which allows us to go with the shorter much more convenient naming as well.
My first thought is a new submodule called os.utils but since os and os.path are oddballs, that also might not be ideal. Maybe osutils or oslib documented under Generic Operating System Services and defined as “Higher level operating system utilities”. Using a new module name would also allow for a backported PyPI library for older Pythons.
I’d also strongly suggest just using the obvious and short names for functions. I much rather write something like:
from osutils import umask, chdir
with umask(0o007):
with chdir(my_working_dir):
do_stuff()
than any of the longer options in the list.
and yet, also agree that it really shouldn’t be removed for backward compatibility purposes, even if there is a preferred rehomed alias ↩︎
Hmm, utils is another term that I’d consider has no meaning at all. It doesn’t tell you what goes in os and what goes in osutils.
I already don’t think the distinction between os, os.path and shutil is clear cut. os contains file operations. os.path also contains file operations. shutil contains higher level file operations but it requires a fairly in depth knowledge of the underlying C APIs to know that recursively copying/deleting a directory is not a native OS function which makes it high level. Do we really want a fourth namespace?
You’re like 50 posts into this thread. umask is part of Posix, and shutil is suited for these goodies. Its usecase and advantages have been established.
Agreed, that’s what has been distilled from this thread (and the chdir one) that there’s a common theme among them. chdir and _RedirectStream are already in contextlib and could do to inherit from a common class AbstractSubstitutionContextManager as I said above.
Literally 8 lines.
class substitute_umask(SubstitutionContext):
"""Non thread-safe context manager to change the process's umask."""
def apply(self):
return os.umask(self._applied_value)
def revert(self, previous_value):
os.umask(previous_value)