Non-UTF-8 platforms are always fragile. There is no way to tell whether the final output goes to a file, a terminal, or both (tee). A user may use the en_US.US-ASCII locale from a UTF-8 terminal.
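To illustrate the point about tee, here is a minimal sketch of what a Python process can observe about its output stream. The helper name describe_stream is hypothetical, and an os.pipe() pair stands in for a shell pipeline such as `somescript.py | tee log.txt`; the writer sees only a pipe, whatever sits at the far end.

```python
import os


def describe_stream(stream):
    """Return what Python can observe about a text stream's destination."""
    fd = stream.fileno()
    return {
        "isatty": stream.isatty(),
        # os.device_encoding() returns None when fd is not a terminal.
        "device_encoding": os.device_encoding(fd),
        "encoding": stream.encoding,
    }


# A pipe stands in for "| tee": the reader could forward the data to a
# terminal, a file, or both, and the writer has no way to find out.
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd, "rb"), os.fdopen(write_fd, "w") as piped:
    info = describe_stream(piped)

print(info["isatty"], info["device_encoding"], info["encoding"])
```

The reported encoding is whatever default Python picked for the stream, so it says nothing about what the eventual terminal or file expects.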
Other options:
a. Use UTF-8 always on Unix. os.device_encoding() is not UTF-8 only when PYTHONLEGACYWINDOWSSTDIO is enabled and the fd is a console.
- Unix users need to use PYTHONIOENCODING when they want non-UTF-8 stdio.
- Works very well with tools that use UTF-8 regardless of locale (Node.js, Rust, Go, Java (>=18), etc.)
- Does not work well with tools that use the locale encoding when the locale is not UTF-8.
b. Do not touch stdio and subprocess.PIPE at all.
- somescript.py -o outfile.txt and somescript.py > outfile.txt may use different encodings.
- It would be confusing for new users…
- Users need to use PYTHONIOENCODING when they want UTF-8 stdio. (status quo)
- Works well with tools that use the locale encoding.
- If we want to change the stdio encoding in the future, another breaking change will be required.
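As a rough sketch of how PYTHONIOENCODING overrides the stdio encoding in both options, the snippet below starts a child interpreter with its stdout connected to a pipe and asks it for sys.stdout.encoding. The helper name child_stdout_encoding is hypothetical, and the default result depends on the platform, the locale, and the Python version.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env=None):
    """Return sys.stdout.encoding as seen by a child Python whose stdout is a pipe."""
    env = dict(os.environ)
    # Remove inherited overrides so the child starts from its own defaults.
    env.pop("PYTHONIOENCODING", None)
    env.pop("PYTHONUTF8", None)
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


default_encoding = child_stdout_encoding()
forced_encoding = child_stdout_encoding({"PYTHONIOENCODING": "utf-8"})

print("default:", default_encoding)
print("PYTHONIOENCODING=utf-8:", forced_encoding)
```

Under (a) the override would be needed to get a non-UTF-8 stdio on Unix; under (b) it stays the way to get UTF-8 stdio when the locale is not UTF-8.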
Maybe (b) is the most conservative approach. That is what I thought when I wrote PEP 597 (EncodingWarning)…