Continuing the discussion from Str.dedent vs str.removeindent:
I have been wondering, given my own distaste for those methods (explained earlier in that thread) - have there been any surveys of the use of text-like
bytes methods in 3.x code “in the wild”? I understand that it doesn’t accomplish much to just remove existing functionality (I can’t see how it would meaningfully allow for optimization), and I assume that changing the
repr for the type is a non-starter - which is why I’m not posting in Ideas or anything.
But I’d be interested to see some motivating examples for why so much support was kept around for (mis-)treating
bytes as a textual type. I thought that a big part of the reason for bumping the major version number and introducing all the breaking changes at once, was exactly so that people would be forced to confront those kinds of bad habits.
Are you looking for justification for the string-like API for bytes?
I work on HTTP requests with Python and use bytes objects and their string-like methods all the time. You can see that type of usage in the Twisted library, for example.
Yes, precisely that sort of thing. I was hoping for more concrete references, because I don’t really know where exactly to look.
From my point of view it's a mandatory feature to have string-like methods on bytes.
The core devs also felt the same when the methods were added.
As we all know, a lot of care is taken not to increase the cost of maintaining Python.
Are you saying that these decisions were wrong?
@kknechtel was very clear about what they were asking:
As such I find this counter question extremely confusing.
I don’t have in-the-wild examples, but I do like having the string-ish ‘strip’ methods available, since you can specify what to remove. Upper/lower-casing always seemed a little weird to me, though.
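To illustrate the point about specifying what to remove, here is a minimal sketch. The input values are made up for the example; only the built-in `bytes.strip` and `bytes.rstrip` methods are used.

```python
# Hypothetical raw line, as it might arrive from a socket or a file.
raw = b"  Host: example.org \r\n"

# With no argument, strip() removes ASCII whitespace from both ends.
stripped = raw.strip()

# With an argument, only the listed byte values are removed.
no_newline = raw.rstrip(b"\r\n")

# The argument is a set of byte values, not a prefix or suffix.
framed = b"--boundary--".strip(b"-")

print(stripped)    # b'Host: example.org'
print(no_newline)  # b'  Host: example.org '
print(framed)      # b'boundary'
```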
The reason was to make some breaking changes that we thought made sense while still trying to make it somewhat reasonable to port preexisting code over. Since
str in Python 2 was being used for binary data already, ripping out methods people were using for bytes-like data from
bytes itself would have made the transition even harder. Hence why Python 3
bytes is effectively Python 2
bytes and bytearray ‘text’ methods are most needed for low-level operations where one reads a block, examines and maybe alters some bytes, and retransmits – and speed is essential. For most Python programmers who enjoy the object model, working directly in encoded bytes – and sometimes bits – is not as fun. But we should appreciate the people who do such work.
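As a rough sketch of that read, examine, alter, retransmit pattern: the function name `patch_block` and the 4-byte `data` tag are invented for illustration and do not come from any real protocol.

```python
def patch_block(block: bytes) -> bytes:
    """Upper-case a hypothetical 4-byte ASCII tag at the start of a block."""
    buf = bytearray(block)           # mutable copy of the block that was read
    if buf.startswith(b"data"):      # examine the leading bytes
        buf[:4] = buf[:4].upper()    # alter them in place, no decoding needed
    return bytes(buf)                # immutable result, ready to retransmit


print(patch_block(b"data\x00\x01\x02"))  # b'DATA\x00\x01\x02'
```

The payload after the tag may be arbitrary binary data; the "text" methods only touch the ASCII part.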
HTTP header names are defined as case-insensitive ASCII.
It is common to see a header name written in camel case but compared in lower case.
header = b'ContentLength: 457'
# split() returns a list, so take the name part before lower-casing it
if header.split(b':')[0].lower() == b'contentlength':
    ...  # handle the header
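A slightly fuller sketch of the same idea, assuming the headers are already split into one bytes object per line. The helper name `find_header` is hypothetical and not taken from Twisted or any other library.

```python
from typing import Iterable, Optional


def find_header(lines: Iterable[bytes], wanted: bytes) -> Optional[bytes]:
    """Return the value of the first header whose name matches, ignoring case."""
    wanted = wanted.lower()
    for line in lines:
        # partition() splits on the first colon only, so values may contain colons.
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == wanted:
            return value.strip()
    return None


headers = [b"Host: example.org:8080", b"CONTENT-LENGTH: 457"]
print(find_header(headers, b"Content-Length"))  # b'457'
```

Everything stays in bytes, so there is no decode/encode round trip on the hot path.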
That makes sense, today I learned! Great example.
At work we execute bytes.lower around 650 * 10**9 times a day, I estimate.
I’m intrigued by the specificity there. Is that actually a meaningful estimate or just a random large number?
It's based on the transaction rate we process, with an assumption about the average number of headers on each HTTP request.
I used 10 headers on the request and 10 on the response.
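For a sense of scale, the estimate can be worked backwards. This assumes exactly one `bytes.lower` call per header, which is my simplification and not something stated above.

```python
calls_per_day = 650 * 10**9          # figure quoted in the thread
headers_per_transaction = 10 + 10    # request headers + response headers
seconds_per_day = 24 * 60 * 60

# Assumption: one lower() call per header.
transactions_per_day = calls_per_day // headers_per_transaction
transactions_per_second = transactions_per_day / seconds_per_day

print(transactions_per_day)            # 32500000000
print(round(transactions_per_second))  # 376157
```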
I seem to remember Mercurial was a big proponent of string-like bytes methods, and held out on Python 3 migration until
bytes % args was added in Python 3.5.
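For reference, a small sketch of what that %-formatting on bytes looks like in Python 3.5 and later. The status line is just an illustrative value.

```python
status = 200
reason = b"OK"

# %d accepts an int; %b accepts a bytes-like object.
# Passing a str to %b raises TypeError, so text must be encoded first.
status_line = b"HTTP/1.1 %d %b\r\n" % (status, reason)

print(status_line)  # b'HTTP/1.1 200 OK\r\n'
```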