Add legacy_text_encoding option to make UTF-8 default

Platforms that do not use UTF-8 are always fragile. There is no way to tell whether the final output goes to a file, a terminal, or both (tee). A user may also be running with an en_US.US-ASCII locale on a UTF-8 terminal.

Other options:

a. Always use UTF-8 on Unix. os.device_encoding() is non-UTF-8 only when PYTHONLEGACYWINDOWSSTDIO is enabled and the fd is a console.

  • Unix users need to set PYTHONIOENCODING when they want non-UTF-8 stdio.
  • Works very well with tools that use UTF-8 regardless of locale (Node.js, Rust, Go, Java (>=18), etc.)
  • Does not work well with tools that use the locale encoding when the locale is not UTF-8.
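
A minimal sketch of the PYTHONIOENCODING escape hatch mentioned under (a): it starts a child interpreter with the variable set and reports which encoding the child picked for sys.stdout. The helper name child_stdout_encoding is made up for illustration, and the result is passed through codecs.lookup() so that aliases such as latin-1 come back under their canonical codec name.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(env_overrides):
    """Return the canonical codec name a child interpreter uses for sys.stdout.

    Hypothetical helper for illustration; env_overrides is merged into
    the current environment before the child is started.
    """
    env = {**os.environ, **env_overrides}
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # Normalize aliases (e.g. "latin-1") to the canonical codec name.
    return codecs.lookup(result.stdout.strip()).name


# A user who wants non-UTF-8 stdio opts in explicitly:
print(child_stdout_encoding({"PYTHONIOENCODING": "latin-1"}))
# A user who wants UTF-8 stdio under the status quo does the same:
print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
```

Either way the variable overrides whatever the locale would have selected, so the same mechanism serves both (a) and (b); only the default changes.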

b. Do not touch stdio and subprocess.PIPE at all.

  • somescript.py -o outfile.txt and somescript.py > outfile.txt may use different encodings.
    • This would be confusing for new users…
  • Users need to set PYTHONIOENCODING when they want UTF-8 stdio. (status quo)
  • Works well with tools that use the locale encoding.
  • If we want to change the stdio encoding in the future, another breaking change will be required.
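
To make the first bullet under (b) concrete, here is a rough sketch of how the two outputs could diverge. It does not touch the real stdout: two in-memory streams stand in for the file opened with a UTF-8 default and for stdout under a non-UTF-8 locale. The latin-1 locale and the helper name encode_as are assumptions chosen only for the example.

```python
import io


def encode_as(text, encoding):
    """Write text through a TextIOWrapper and return the resulting bytes.

    Hypothetical helper; it mimics what a text stream with the given
    encoding would put on disk or on a pipe.
    """
    buf = io.BytesIO()
    wrapper = io.TextIOWrapper(buf, encoding=encoding, newline="\n")
    wrapper.write(text)
    wrapper.flush()
    data = buf.getvalue()
    wrapper.detach()  # leave buf open; we only needed the bytes
    return data


text = "café\n"

# Stand-in for `somescript.py -o outfile.txt` with open() defaulting to UTF-8.
file_bytes = encode_as(text, "utf-8")
# Stand-in for `somescript.py > outfile.txt` under an assumed latin-1 locale.
stdout_bytes = encode_as(text, "latin-1")

print(file_bytes)    # b'caf\xc3\xa9\n'
print(stdout_bytes)  # b'caf\xe9\n'
```

The same script would leave two different files behind depending on how the output was requested, which is the confusion the bullet describes.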

Maybe (b) is the most conservative approach. That is also what I had in mind when I wrote PEP 597 (EncodingWarning)…