PEP 597: Use UTF-8 for default text file encoding

I still believe the backwards compatibility impact is bad enough that this is not worthwhile, as we concluded while discussing my two encoding PEPs (528 and 529, from memory).

My impression is that the rationales in this proposal have not yet been validated, and are based on a narrow view of Python’s userbase. We should not so lightly cause user data to become “corrupt” between Python 3.8 and Python 3.9.

I always teach that if you don’t know the encoding of a file, you can’t read that file, so make somebody tell you. If you’re writing the file, either someone will demand a particular encoding, or you should choose (and often UTF-8 is a good choice). I would prefer to make the encoding parameter required, if we’re going to make a breaking change here!
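To make that concrete, here is a minimal sketch of what happens when the reader and writer disagree about the encoding. The `roundtrip` helper is a name made up for this illustration; only `open()`, `tempfile`, and `os` from the standard library are used.

```python
import os
import tempfile


def roundtrip(text, write_encoding, read_encoding):
    """Write text with one encoding, then read it back with another.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # The writer states its encoding explicitly.
        with open(path, "w", encoding=write_encoding) as f:
            f.write(text)
        # The reader must be told the same encoding to get the text back.
        with open(path, "r", encoding=read_encoding) as f:
            return f.read()
    finally:
        os.remove(path)


same = roundtrip("café", "utf-8", "utf-8")       # reader was told the truth
mismatch = roundtrip("café", "utf-8", "cp1252")  # reader guessed wrong

print(same)      # café
print(mismatch)  # cafÃ©
```

The second call raises no error at all; it silently returns the wrong text, which is the kind of "corruption" I am worried about when the default changes underneath existing files.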

That said, if this is going to be voted in anyway, like everything else I’ve had concerns about recently, then a “best guess current locale” encoding (which is not the same as the console code page anyway) plus a noisy warning for an unspecified encoding in open(), etc., is about the best way to make the transition painful enough now that any code currently being maintained will be ready by the time 3.9 is released.
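As a rough sketch of the behaviour I mean, assuming a wrapper function rather than any change to the builtin: `open_with_warning` is a hypothetical name, the choice of `DeprecationWarning` is my assumption, and `locale.getpreferredencoding(False)` stands in for the "best guess current locale" encoding.

```python
import locale
import warnings


def open_with_warning(file, mode="r", encoding=None, **kwargs):
    """Hypothetical wrapper around open(), not the PEP's mechanism.

    Warns when a text-mode call omits encoding, then falls back to the
    locale's preferred encoding.
    """
    if "b" not in mode and encoding is None:
        # Best guess at the current locale encoding (an assumption here).
        encoding = locale.getpreferredencoding(False)
        warnings.warn(
            f"no encoding given; falling back to locale encoding {encoding!r}",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
    return open(file, mode, encoding=encoding, **kwargs)
```

Binary-mode calls and calls that pass `encoding=` explicitly go straight through without a warning, so only code that relies on the implicit default gets noisy.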
