Non-UTF-8 platforms are always fragile. There is no way to tell whether the final output goes to a file, a terminal, or both (tee). A user may use the en_US.US-ASCII locale from a UTF-8 terminal.
Other options:
a. Always use UTF-8 on Unix. os.device_encoding() is not UTF-8 only when PYTHONLEGACYWINDOWSSTDIO is enabled and the fd is a console.
- Unix users need to use PYTHONIOENCODING when they want non-UTF-8 stdio.
- Works very nicely with tools that use UTF-8 regardless of locale (node, Rust, Go, Java (>=18), etc…)
- Doesn't work nicely with tools that use the locale encoding when the locale is not UTF-8.
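As a rough illustration of the PYTHONIOENCODING escape hatch mentioned above, here is a minimal sketch that runs a child interpreter with the variable set and inspects the raw bytes it writes to a pipe. The helper name child_stdout_bytes and the choice of latin-1 as the "non-UTF-8" encoding are assumptions made for the example only.

```python
import os
import subprocess
import sys

# Child program: write one non-ASCII character (U+00E9) to stdout, no newline.
CHILD = 'print("\\u00e9", end="")'


def child_stdout_bytes(io_encoding):
    """Run the child with PYTHONIOENCODING set and return the raw bytes it wrote.

    Hypothetical helper for this example; not part of any library.
    """
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The same text becomes different bytes depending on the stdio encoding.
    print(child_stdout_bytes("utf-8"))
    print(child_stdout_bytes("latin-1"))
```

Under option (a), the first call shows what a Unix user would get by default, and the second shows what they would have to opt into explicitly.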
b. Do not touch stdio and subprocess.PIPE at all.
- somescript.py -o outfile.txt and somescript.py > outfile.txt may use different encodings.
  - It would be confusing for new users…
- Users need to use PYTHONIOENCODING when they want UTF-8 stdio. (status quo)
- Works nicely with tools that use the locale encoding.
- If we want to change the stdio encoding in the future, another breaking change will be required.
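The -o versus > mismatch under option (b) can be simulated roughly as follows. This is only a sketch: it assumes files default to UTF-8 (spelled out with an explicit encoding="utf-8") and uses PYTHONIOENCODING=latin-1 as a stand-in for a non-UTF-8 locale on stdio. The script body and the run_both helper are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for somescript.py: with "-o PATH" it writes to a file,
# otherwise it writes to stdout, which the caller redirects to a file.
SCRIPT = r'''
import sys
text = "\u00e9"
if len(sys.argv) == 3 and sys.argv[1] == "-o":
    # Assumption: the default file encoding is UTF-8, made explicit here.
    with open(sys.argv[2], "w", encoding="utf-8") as f:
        f.write(text)
else:
    sys.stdout.write(text)
'''


def run_both(stdio_encoding="latin-1"):
    """Return (bytes written via -o, bytes written via redirection)."""
    # PYTHONIOENCODING simulates a stdio encoding that follows a non-UTF-8 locale.
    env = dict(os.environ, PYTHONIOENCODING=stdio_encoding)
    with tempfile.TemporaryDirectory() as tmp:
        via_option = os.path.join(tmp, "via_option.txt")
        via_redirect = os.path.join(tmp, "via_redirect.txt")
        subprocess.run(
            [sys.executable, "-c", SCRIPT, "-o", via_option],
            env=env, check=True,
        )
        with open(via_redirect, "wb") as out:
            subprocess.run(
                [sys.executable, "-c", SCRIPT],
                env=env, stdout=out, check=True,
            )
        with open(via_option, "rb") as f:
            option_bytes = f.read()
        with open(via_redirect, "rb") as f:
            redirect_bytes = f.read()
    return option_bytes, redirect_bytes


if __name__ == "__main__":
    print(run_both())
```

The two files hold the same text but different bytes, which is the situation a new user would have to debug.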
Maybe (b) is the most conservative approach. And that is what I had in mind when I wrote PEP 597 EncodingWarning…