Hi,
The io.TextIOWrapper documentation claims that the default encoding is locale.getencoding(). The getencoding function ignores UTF-8 mode. But I thought the whole point of UTF-8 mode was to make UTF-8 encoding the default, especially for text IO.
Example:
$ LC_ALL=en_US.ISO8859-1 python -X utf8
Python 3.11.0rc2 (main, Sep 13 2022, 00:00:00) [GCC 12.2.1 20220819 (Red Hat 12.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from locale import getencoding
>>> from pathlib import Path
>>> getencoding()
'ISO-8859-1'
>>> Path("utf8.txt").read_text(encoding='ISO-8859-1')
'Ä¦ÃªÅ\x82Å\x82Ã¸\n'
>>> Path("utf8.txt").read_text(encoding='UTF-8')
'Ħêłłø\n'
>>> Path("utf8.txt").read_text()
'Ħêłłø\n'
I see that locale.getpreferredencoding(False) was changed to locale.getencoding() in this part of the documentation when encoding="locale" was made to ignore UTF-8 mode (@methane). But this means it is encoding="locale" that makes the encoding locale.getencoding(), not encoding=None, right?