Deprecating codecs.open?

codecs.open was a way to open text files that worked in Python 2, but with the introduction of io.open, its significance has greatly diminished.

Now, the differences between codecs.open and TextIOWrapper are a source of paper cuts.
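One such paper cut, as a minimal sketch: codecs.open always opens the underlying file in binary mode, so newlines are not translated, whereas the built-in open translates them by default. The helper name `read_both_ways` and the sample bytes are only illustrative, and the warning filter is there because newer Python versions may emit a DeprecationWarning for codecs.open.

```python
import codecs
import os
import tempfile
import warnings


def read_both_ways(data: bytes):
    """Write data to a temp file and read it back three different ways."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)

        # codecs.open: the file is opened in binary mode, no newline translation.
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", DeprecationWarning)
            with codecs.open(path, "r", encoding="utf-8") as f:
                via_codecs = f.read()

        # Built-in open: universal newlines, "\r\n" becomes "\n" on read.
        with open(path, "r", encoding="utf-8") as f:
            via_open = f.read()

        # Built-in open with newline="": line endings are left untouched.
        with open(path, "r", encoding="utf-8", newline="") as f:
            via_open_raw = f.read()
    finally:
        os.remove(path)
    return via_codecs, via_open, via_open_raw


via_codecs, via_open, via_open_raw = read_both_ways(b"a\r\nb\n")
```

If the untranslated behavior is what existing code relies on, passing newline="" to open looks like the closest equivalent for reading.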

It seems we do not have enough resources to maintain the codecs module.

What do you think about deprecating codecs.open, StreamReader and StreamWriter?
Do they have any use cases that cannot be covered by open and io.TextIOWrapper?

Since they have had replacements for a long time,
I think 3 years of deprecation in the documentation + 3 years of DeprecationWarning is enough before removing them.


Searching the top 15k PyPI projects (downloaded today) for codecs\.open:

  • Found 1,567 matching lines in 613 projects

\bStreamReader\b:

  • Found 2,959 matching lines in 247 projects

\bStreamWriter\b:

  • Found 2,565 matching lines in 185 projects

I can share the details if wanted, but it’s too big for Discord.


Most of the StreamReader and StreamWriter matches are from asyncio or aiohttp, not from codecs.

Anyway, it is difficult to remove codecs.open, and StreamReader/Writer cannot be deprecated unless codecs.open is deprecated first.

How about deprecating codecs.open without a removal schedule?
As with the alias methods in TestCase, we need a very long deprecation period. Maybe 10+ years.


I think you are forgetting that StreamReader/Writer play a central part in the whole codecs sub-system, so eventually removing them would require a lot more redesign work for the sub-system to continue working (both are part of what defines a codec in Python, see CodecInfo).

You’d have to essentially replace the StreamReader/Writer logic which the codec sub-system uses with io stack classes, and only after investigating whether this is easily possible. They look fairly similar, but their method signatures are different, TextIOWrapper does not separate reading and writing, and the semantics are different as well.
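To make the structural difference concrete, here is a rough sketch using only existing stdlib calls: the codecs side hands out a separate writer class and reader class per encoding, while a single io.TextIOWrapper object covers both directions. The variable names and the sample text are just for illustration.

```python
import codecs
import io

# codecs: separate classes for writing and reading, each wrapping a byte stream.
write_buffer = io.BytesIO()
writer = codecs.getwriter("utf-8")(write_buffer)
writer.write("é")
encoded = write_buffer.getvalue()

reader = codecs.getreader("utf-8")(io.BytesIO(encoded))
decoded_via_codecs = reader.read()

# io: one TextIOWrapper object both writes and reads the same byte stream.
wrapper = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
wrapper.write("é")
wrapper.seek(0)  # seek() also flushes the pending write to the buffer
decoded_via_io = wrapper.read()
encoded_via_io = wrapper.buffer.getvalue()
```

Both paths produce the same bytes and text here; the point is only that the object model differs, which is why a drop-in swap is not obvious.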

So overall, I think the idea of simply deprecating the two base classes is premature at this point.

Now I am proposing deprecating only codecs.open. I updated the thread title.

In 2010, I wrote PEP 400 – Deprecate codecs.StreamReader and codecs.StreamWriter. See related discussions:


CodecInfo.streamreader and CodecInfo.streamwriter are only used by codecs.open(). Would you mind elaborating on what you mean by “play a central part in the whole codecs sub-system”?

It’s possible to emit a DeprecationWarning in the StreamReader and StreamWriter constructors without breaking Python. It remains possible to define sub-classes (without emitting a DeprecationWarning), which is needed to define codecs such as UTF-8 (Lib/encodings/utf_8.py):

import codecs

# Subclasses only supply the codec-specific encode/decode functions.
class StreamWriter(codecs.StreamWriter):
    encode = codecs.utf_8_encode

class StreamReader(codecs.StreamReader):
    decode = codecs.utf_8_decode
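A minimal sketch of that idea, using a hypothetical stand-in class rather than the real codecs.StreamReader (this is not how the stdlib is implemented today): the warning lives in the constructor, so merely defining a subclass stays silent and only instantiation warns.

```python
import io
import warnings


class BaseStreamReader:
    """Hypothetical stand-in for codecs.StreamReader, for illustration only."""

    def __init__(self, stream, errors="strict"):
        # The warning fires on instantiation, not on subclass definition.
        warnings.warn(
            "stream readers are deprecated; use io.TextIOWrapper instead",
            DeprecationWarning,
            stacklevel=2,
        )
        self.stream = stream
        self.errors = errors


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # Defining a codec-specific subclass does not call __init__.
    class Utf8StreamReader(BaseStreamReader):
        pass

    warnings_after_definition = len(caught)

    # Creating an instance goes through the base constructor and warns.
    Utf8StreamReader(io.BytesIO(b""))
    warnings_after_instantiation = len(caught)
    warning_category = caught[-1].category
```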

Every single codec in Python exposes subclasses of these two base classes via the CodecInfo returned by the codec search function. PEP 100 has the details.
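This is easy to check from the interpreter. The short example below looks up the UTF-8 codec and inspects the stream classes on the returned CodecInfo; only documented codecs functions and attributes are used.

```python
import codecs

# The codec search function returns a CodecInfo for the requested encoding.
info = codecs.lookup("utf-8")

# The stream classes are exposed as attributes on the CodecInfo object.
reader_cls = info.streamreader
writer_cls = info.streamwriter

reader_is_subclass = issubclass(reader_cls, codecs.StreamReader)
writer_is_subclass = issubclass(writer_cls, codecs.StreamWriter)

# codecs.getreader/getwriter hand back those same classes.
same_reader = codecs.getreader("utf-8") is reader_cls
same_writer = codecs.getwriter("utf-8") is writer_cls
```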

+1 on soft deprecating codecs.open(), without actually removing the function for a long while. The standard open() is the better choice these days.

Still, the function is in widespread use, so it’ll take a long while to convince people to reconsider their choice. It is also still needed by projects wanting to maintain Python 2 compatibility.

Note that deprecating the function will not allow deprecating StreamReader/Writer as a result, unless there’s a working migration path forward, e.g. new base classes built around the io stack for the codecs. These could be added as additional fields in CodecInfo, and codecs could slowly migrate over with a longer deprecation period.
