Bikeshedding: a method to refresh os.environ

It’s also imported by site, so generally you never get a chance to defer it (pro-tip, python -X importtime -c "pass" to find this out).

4 Likes

getenv (the glibc function, not os.getenv in python) is implemented by looping over the extern char **environ and strncmping each name until a match is found. This is

  1. thread-unsafe
  2. very slow (and gets progressively slower as the environment grows)

Unfortunately, it does. It is fundamentally impossible to get a “live” state of the environment in a thread-safe way on POSIX. Adding a thread-lock/critical section in CPython is not sufficient, because the underlying extern char **environ pointer may be read or written in foreign threads (running code written in other languages that won’t respect the CPython lock).

4 Likes

It seems part of the concern is with the method on os.environ being an attractive nuisance, with folks calling it when they don’t need to and at best suffering an avoidable performance loss, and at worst introducing genuine race conditions where they didn’t previously exist.

If I’ve understand that concern correctly, would providing the functionality as a module level os.reload_environment() function rather than as an os.environ method help alleviate those concerns?

Having it as a method on os.environ feels weird to me anyway, as os.environ, os.environb, and os.getenv all access the same underlying cache, and hence reloading that cache affects all of them, not just os.environ.

9 Likes

Yes, I do think that is better. Especially given your latter point about the impact being beyond the single instance the method otherwise appears to apply to.

6 Likes

I find Brett’s rebuild_cache name much more effective at making this clearer

2 Likes

The key property of a cache, as I understand the word, is that it automatically makes sure it has up to date information.
By os.environment does not do that, by design.

IMO .refresh() is just fine as a name, because the intuition people have (of refreshing their desktop, or refreshing a page), is to fetch newer information, implying that the information currently visible is inherently prone to becoming stale. So refresh() in my mind is just a succint way of saying rebuild_cache().

3 Likes

I personally have never needed anything like this .refresh method but I can see that it solves a problem that I can imagine would arise for some other people. The amount of bikeshedding about this in the previous thread and now here is absurd though. Time would have been better spent just moving on after Victor’s original PR.

Hardly anyone needs this. Those who do can read the docs, stackoverflow etc. The exact choice of name from all the options discussed makes little difference in practice. No name will ever convey the full meaning without reading the docs and this will never be a common operation that many people need.

Just choose a name and make the docs clear about what it does!

4 Likes

It makes a pretty big difference to the people who get misled by it and the people who have to sort out the people who were misled by it (and I do think that .refresh() is misleading). Millions of people will read that name so even one that confuses 1% less people is worth bike shedding over.

It sounds to me that, since there’s no thread safe way to read or write real environment variables in Python or extensions to Python, even the C extensions shouldn’t be doing it. To that end, the original request that prompted this function being written not safe and I’d therefore be opposed to creating any standard library function to endorse it – irrespective of what we call it.

6 Likes

I think the name os.environ.refresh() is fine as it is.

Only the docs should be updated with a warning about thread-(un)safety of os.environ in general, to make it clear that bad things can happen… I don’t think this will often be an issue, just like I don’t think .refresh() will be needed often (I’ve never had such a need in 30 years of using Python), so taken all together, the world will continue spinning, even with this method in place, and if it makes some people happy, even better :smile:

9 Likes

We need to put a period here. Either reimplement this feature as a module level os.reload_environment() function (this looks to me the most reasonable idea of the suggested), or leave it as os.environ.refresh(). All other options look not having enough support to end the endless bikeshedding.

  • os.envoron.refresh() (keep the current code)
  • os.reload_environment() (a module level function)
  • other option / continue bikeshedding
0 voters
4 Likes

Is not having the feature at all still a vote-able option?

1 Like

It is the third option.