Change environment variable style

corona10 · October 2, 2023, 2:46pm

While introducing a new environment variable for overriding the CPU count from os.cpu_count / os.process_cpu_count, I noticed that some people want to change the environment variable style from FULLYCONCATEDSTYLE to UNDER_SCORE_STYLE due to the possibility of typos and poor readability.

Even if we decide not to migrate the old names, we can declare that new environment variables will use the new style (e.g. PYTHON_CPU_COUNT). That would be more readable and less typo-prone than PYTHONCPUCOUNT.

If there is no serious objection to using new style, PYTHON_CPU_COUNT will be the first environment variable that uses the underscore style.
I would like to hear whether anyone has concerns about using the underscore style for environment variables from now on.

cc @vstinner @hugovk @ambv

Use underscore for new vars PYTHON_CPU_COUNT, leave old names unchanged
Accept both names for old vars, use underscore for new names
Do nothing: continue using PYTHONWITHOUTUNDERSCORE style


hugovk · October 2, 2023, 3:01pm

I support using underscores for new variables. In addition to improving readability, it should also improve accessibility, including for people using screen readers.

As a rough demo, here’s how the VoiceOver screen reader on macOS deals with PYTHONCPUCOUNT and PYTHON_CPU_COUNT:

[video]

It really struggles without underscores: PYTHONCPUCOUNT sounds something like “pithon-ki-count”! Whereas PYTHON_CPU_COUNT is much better: “python underscore C P U underscore count”.

vstinner · October 2, 2023, 3:08pm

Use underscore for new vars PYTHON_CPU_COUNT, leave old names unchanged

There is an alternative for typos in “old” names: emit a warning if Python gets an env var starting with PYTHON which is unknown. The warning is emitted by default, but you can opt out to silence it.
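A rough sketch of that idea, written in Python only for illustration (a real check would presumably live in the interpreter's startup code). The function names, the small KNOWN subset, and the opt-out variable PYTHON_NO_WARN_ENV_VARS are assumptions made for this example, not existing CPython behavior.

```python
import difflib
import warnings

# A small subset of the documented variables, for illustration only.
KNOWN = {
    "PYTHONPATH", "PYTHONHOME", "PYTHONHASHSEED", "PYTHONWARNINGS",
    "PYTHONDONTWRITEBYTECODE", "PYTHONINTMAXSTRDIGITS", "PYTHONUTF8",
}
# Hypothetical opt-out switch; not an existing CPython variable.
QUIET = "PYTHON_NO_WARN_ENV_VARS"


def find_unknown_python_vars(environ):
    """Map each unknown PYTHON* name to its closest known name (or None)."""
    if environ.get(QUIET):
        return {}
    unknown = {}
    for name in environ:
        if name.startswith("PYTHON") and name not in KNOWN and name != QUIET:
            close = difflib.get_close_matches(name, sorted(KNOWN), n=1)
            unknown[name] = close[0] if close else None
    return unknown


def warn_unknown_python_vars(environ):
    """Emit one warning per unknown PYTHON* variable, with a hint if any."""
    for name, hint in find_unknown_python_vars(environ).items():
        suffix = f" (did you mean {hint}?)" if hint else ""
        warnings.warn(f"unknown environment variable {name}{suffix}")
```

Calling warn_unknown_python_vars(os.environ) at startup would flag a misspelling such as PYTHONHASHSED, but it would also flag any PYTHON-prefixed variable the running version does not know about.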

vstinner · October 2, 2023, 3:33pm

My issue with long names is that I often make typos and it’s hard for me to check if I made a typo or not.

Names of Python environment variables, longest (25 characters) to shortest (10):

PYTHONWARNDEFAULTENCODING
PYTHONDONTWRITEBYTECODE
PYTHONINTMAXSTRDIGITS
PYTHONCOERCECLOCALE
PYTHONNODEBUGRANGES
PYTHONPYCACHEPREFIX
PYTHONFAULTHANDLER
PYTHONBREAKPOINT
PYTHONIOENCODING
PYTHONNOUSERSITE
PYTHONPLATLIBDIR
PYTHONUNBUFFERED
PYTHONHASHSEED
PYTHONOPTIMIZE
PYTHONSAFEPATH
PYTHONWARNINGS
PYTHONDEVMODE
PYTHONINSPECT
PYTHONSTARTUP
PYTHONVERBOSE
PYTHONCASEOK
PYTHONMALLOC
PYTHONDEBUG
PYTHONHOME
PYTHONPATH
PYTHONUTF8

With underscores, it can look like:

PYTHON_WARN_DEFAULT_ENCODING
PYTHON_DONT_WRITE_BYTECODE
PYTHON_INT_MAX_STR_DIGITS
PYTHON_COERCE_C_LOCALE
PYTHON_NO_DEBUG_RANGES
PYTHON_PYCACHE_PREFIX (PY_CACHE?)
PYTHON_FAULTHANDLER (fault… handler?)
PYTHON_BREAKPOINT (break… point?)
PYTHON_IO_ENCODING
PYTHON_NO_USER_SITE
PYTHON_PLATLIBDIR (PLAT_LIB_DIR?)
PYTHON_UNBUFFERED
PYTHON_HASHSEED (hash… seed?)
PYTHON_OPTIMIZE
PYTHON_SAFE_PATH
PYTHON_WARNINGS
PYTHON_DEV_MODE
PYTHON_INSPECT
PYTHON_STARTUP
PYTHON_VERBOSE
PYTHON_CASE_OK
PYTHON_MALLOC
PYTHON_DEBUG
PYTHON_HOME
PYTHON_PATH
PYTHON_UTF8

I reported to @corona10 that the very first time I tried his new command line option, I made a typo: -X cpucount. It’s surprising since the affected functions have an underscore: os.cpu_count() and os.process_cpu_count(). The -X cpu_count option has an underscore, but PYTHONCPUCOUNT does not, so I wrote -X cpucount. The same confusion already exists for the four existing -X options that have an underscore and a related env var:

-X frozen_modules: (no env var, so no confusion?)
-X int_max_str_digits: PYTHONINTMAXSTRDIGITS (my eyes! ouch!), it’s plural, right? S suffix, right?
-X no_debug_ranges: PYTHONNODEBUGRANGES, lovely double NN
-X pycache_prefix: PYTHONPYCACHEPREFIX (underscore vs no underscore)
-X warn_default_encoding: PYTHONWARNDEFAULTENCODING (underscore vs no underscore)

Other -X options without underscores:

-X dev: PYTHONDEVMODE, env var gets an additional MODE suffix.
-X faulthandler: PYTHONFAULTHANDLER.
-X importtime: PYTHONPROFILEIMPORTTIME, surprise! the env var gets an additional PROFILE!
-X perf: PYTHONPERFSUPPORT, env var gets an additional SUPPORT suffix
-X showrefcount
-X tracemalloc: PYTHONTRACEMALLOC
-X utf8: PYTHONUTF8
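To make the inconsistency concrete, here is a small sketch that derives the name you might guess from each -X option (drop underscores, uppercase, prefix with PYTHON) and compares it with the env var listed above. The helper names are made up for this example, and the guessing rule is only an assumption about what a user would expect.

```python
# (-X option, env var) pairs copied from the two lists above.
PAIRS = [
    ("int_max_str_digits", "PYTHONINTMAXSTRDIGITS"),
    ("no_debug_ranges", "PYTHONNODEBUGRANGES"),
    ("pycache_prefix", "PYTHONPYCACHEPREFIX"),
    ("warn_default_encoding", "PYTHONWARNDEFAULTENCODING"),
    ("dev", "PYTHONDEVMODE"),
    ("faulthandler", "PYTHONFAULTHANDLER"),
    ("importtime", "PYTHONPROFILEIMPORTTIME"),
    ("perf", "PYTHONPERFSUPPORT"),
    ("tracemalloc", "PYTHONTRACEMALLOC"),
    ("utf8", "PYTHONUTF8"),
]


def naive_env_name(option):
    """Guess the env var name: strip underscores, uppercase, add PYTHON."""
    return "PYTHON" + option.replace("_", "").upper()


def surprising_pairs(pairs):
    """Return the pairs whose env var differs from the naive guess."""
    return [(opt, env) for opt, env in pairs if naive_env_name(opt) != env]


for opt, env in surprising_pairs(PAIRS):
    print(f"-X {opt}: guessed {naive_env_name(opt)}, actual {env}")
```

With the pairs above, only dev, importtime and perf are reported: their env vars carry an extra word that the option name does not.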

vstinner · October 2, 2023, 3:37pm

Empirical study of the 48 env vars of my Fedora 38 Linux system:

48% (23) are with underscores
29% (14) are without underscores
23% (11) are single words

23 env vars with an underscore:

CONDA_SHLVL
DBUS_SESSION_BUS_ADDRESS
DEBUGINFOD_URLS
DESKTOP_SESSION
GDM_LANG
GNOME_SETUP_DISPLAY
GNOME_TERMINAL_SCREEN
GNOME_TERMINAL_SERVICE
LS_COLORS
MOZ_GMP_PATH
QT_IM_MODULE
SESSION_MANAGER
SSH_AUTH_SOCK
SYSTEMD_EXEC_PID
VTE_VERSION
WAYLAND_DISPLAY
XDG_CURRENT_DESKTOP
XDG_DATA_DIRS
XDG_MENU_PREFIX
XDG_RUNTIME_DIR
XDG_SESSION_CLASS
XDG_SESSION_DESKTOP
XDG_SESSION_TYPE

14 without underscores:

COLORTERM
GDMSESSION
HISTCONTROL
HISTFILESIZE
HISTSIZE
HOSTNAME
KDEDIRS
LESSOPEN
LOGNAME
MAKEFLAGS
SHLVL (what is that?)
USERNAME
XAUTHORITY (single word?)
XMODIFIERS (single word?)

And 11 are just single words:

DISPLAY
EDITOR
HOME
LANG
MAIL
PATH
PS1
PWD
SHELL
TERM
USER
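For anyone who wants to repeat the count on their own system, a minimal sketch follows. It only separates names with and without an underscore; deciding whether a name without underscores is a single word (HOME) or several glued together (HISTFILESIZE) was done by hand above and is not automated here. The function names are made up for this example.

```python
import os


def split_by_underscore(names):
    """Return (names containing '_', names without '_'), each sorted."""
    with_underscore = sorted(n for n in names if "_" in n)
    without_underscore = sorted(n for n in names if "_" not in n)
    return with_underscore, without_underscore


def share(count, total):
    """Percentage of count in total, rounded to the nearest integer."""
    return round(100 * count / total) if total else 0


with_us, without_us = split_by_underscore(os.environ)
total = len(with_us) + len(without_us)
print(f"{share(len(with_us), total)}% ({len(with_us)}) with underscores")
print(f"{share(len(without_us), total)}% ({len(without_us)}) without underscores")
```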

malemburg · October 2, 2023, 3:57pm

This is not going to work out if you are regularly using multiple Python versions. Suppose you add this to 3.13, then as soon as Python 3.13 sees a variable that was added for 3.14, it will complain. So you add an env var which silences the warning… in both 3.13 and 3.14.

I still think it’s a good idea to slowly migrate over to snake case for env vars (and perhaps elsewhere as well). We won’t ever reach consistency, but could at least be more consistent going forward.

ambv · October 2, 2023, 4:05pm

Allowing aliases with underscores for old names makes sense on the surface but it introduces some new issues:

Which name takes precedence if both are present? It’s not as simple as “the new one” because the old spelling might have been the one added later. Some systems compose an environment from multiple levels of configuration, in which case it’s also not entirely clear which one should take precedence. The Zen of Python would tell us that the process should then refuse to start, and that would probably be wisest, even if it is disruptive. But it would be disruptive.
Software run in the same environment would change behavior between 3.12 and 3.13 if the new spelling of old variables was used. This will lead to user error when testing things, especially for open-source maintainers who need to test on multiple Python versions.
Better yet, imagine there is a variable in the user’s environment with a typo: it uses underscores, even though Python currently ignores such variables. Until now, that variable was never really taken into account, but the user never noticed because everything worked. And boom, in 3.13 the variable is suddenly interpreted, and the user’s program breaks mysteriously.

Therefore, even though it would be cleaner to have underscores in old variables, I personally don’t feel like the transition pain is worth it.

eric.snow · October 2, 2023, 6:00pm

This would take consideration of at least the following:

allow either name
if both are set then underscore name takes precedence
if both are set then emit a warning? (only if not the same?)
deprecation warning for old names?
allow disabling any above warnings
if only one is set then set the other to match?

Those are potential new costs, in exchange for less likely typos (and better readability) in env var names. I’m not sure that pays for itself in the community, though personally I’d prefer the underscore names.
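As a minimal sketch of a few of those rules (accept either name, underscore name wins, warn only when the two values differ, allow silencing the warning), something like the following could work. The function resolve_env and its parameters are hypothetical, and PYTHON_UTF8 is just the underscore spelling suggested earlier in the thread, not a variable Python reads today.

```python
import warnings


def resolve_env(environ, old_name, new_name, warn_on_conflict=True):
    """Return the effective value for a setting, or None if neither is set.

    The underscore (new) name takes precedence when both are present.
    """
    old = environ.get(old_name)
    new = environ.get(new_name)
    if old is not None and new is not None:
        if old != new and warn_on_conflict:
            warnings.warn(
                f"both {new_name} and {old_name} are set to different "
                f"values; using {new_name}",
                RuntimeWarning,
                stacklevel=2,
            )
        return new
    return new if new is not None else old


# Only the old spelling is set, so its value is used.
print(resolve_env({"PYTHONUTF8": "1"}, "PYTHONUTF8", "PYTHON_UTF8"))
```

This leaves the deprecation warning and the "set the other to match" questions open; each of them would add more behavior to document and test.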

Regardless, there’s no urgency. For now we could make sure new env vars have underscores and deal with old env vars in the future.

All of the existing env vars have the old spelling. What do you mean here? An example would help.

I’m not sure I follow. Do you have an example?

When both are set, a warning could be helpful without being so disruptive. Perhaps we only emit it with -X dev. Perhaps we don’t emit if they are the same?

If we were to make using both a warning then we could also provide something like -X no_warn_env_vars or PYTHON_NO_WARN_ENV_VARS to hide the warning. That would help maintainers preserve a single test config between versions.

We should weigh this against the cost of the existing env vars (e.g. typos).

methane · October 3, 2023, 6:13am

>>> len('PYTHONLEGACYWINDOWSFSENCODING')
29
>>> len('PYTHONLEGACYWINDOWSSTDIO')
24

ambv · October 3, 2023, 10:39am

My entire message is about:

How could I improve clarity of what I said?

pitrou · October 3, 2023, 7:17pm

By adding underscores perhaps?

hugovk · October 9, 2023, 10:18pm

Result: 72% chose “Use underscore for new vars PYTHON_CPU_COUNT, leave old names unchanged”.

corona10 · October 9, 2023, 11:28pm

Yeah, the poll result says to leave old names unchanged. We can discuss the migration issue later if we think it is needed.