Consider deprecating and eventually removing the `-b` CLI flag

Recently, while working on the JIT’s constant evaluation (gh-132732: Treat `bytes` as constants in `_Py_uop_sym_is_safe_const` by sobolevn · Pull Request #136033 · python/cpython · GitHub), I realized that the `-b` command-line flag might not be very useful anymore.

What does it do?

Issue a warning when converting bytes or bytearray to str without specifying encoding or comparing bytes or bytearray with str or bytes with int. Issue an error when the option is given twice (-bb).

Docs: 1. Command line and environment — Python 3.13.5 documentation
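To see the three modes side by side, a small sketch like the following can help. It starts child interpreters with no flag, with `-b`, and with `-bb`. The helper name `run_snippet` and the snippet are illustrative only, and the behavior described assumes an interpreter where `-b` is still implemented.

```python
import subprocess
import sys

# Illustrative snippet: a bytes/str comparison that is always False at run time.
SNIPPET = "x = b'abc'; y = 'abc'; print(x == y)"


def run_snippet(*flags):
    """Run SNIPPET in a child interpreter started with the given CLI flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    for flags in ((), ("-b",), ("-bb",)):
        result = run_snippet(*flags)
        print(flags, "exit code:", result.returncode)
        print("  stdout:", result.stdout.strip())
        print("  stderr:", result.stderr.strip())
```

Without the flag the comparison silently prints False. With `-b` it still prints False but a BytesWarning shows up on stderr, and with `-bb` the warning becomes an exception and the child exits with a non-zero status.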

It was originally introduced in 2007 as a TypeError, back in the Python 2 era, one year before the Python 3 release. Original commit: Make it an error to compare a bytes object and a Unicode object. · python/cpython@18c3ff8 · GitHub

It was then converted to a warning in: Merging the py3k-pep3137 branch back into the py3k branch. · python/cpython@98297ee · GitHub

Later, in 3.5, it was extended to also cover int comparisons: Issue #23681: The -b option now affects comparisons of bytes with int. · python/cpython@1dd4982 · GitHub

The main reason for this warning, if I understand correctly, was to help with the Python 2 → Python 3 transition. Since bytes and str changed a lot, it was really useful back then.

But why is now the time to reconsider this warning?

  1. Python 2 is long gone, and the transition tools have been dropped, including lib2to3 and many others. -b is also a transition tool.
  2. We now have type checkers that do a better job of finding such cases. mypy:
byt = b'abc'
st = 'hello'
btar = bytearray()
num = 1

byt == st  # E: Non-overlapping equality check (left operand type: "bytes", right operand type: "str")
byt == num  # E: Non-overlapping equality check (left operand type: "bytes", right operand type: "int")
btar == num  # E: Non-overlapping equality check (left operand type: "bytearray", right operand type: "int")

Link: mypy Playground

Pyright:

byt = b'abc'
st = 'hello'
btar = bytearray()
num = 1

byt == st  # E: Condition will always evaluate to False since the types "Literal[b"abc"]" and "Literal['hello']" have no overlap  (reportUnnecessaryComparison)
byt == num  # E: Condition will always evaluate to False since the types "Literal[b"abc"]" and "Literal[1]" have no overlap  (reportUnnecessaryComparison)
btar == num  # E: Condition will always evaluate to False since the types "bytearray" and "Literal[1]" have no overlap  (reportUnnecessaryComparison)

Link: Pyright Playground

  3. It has to be special-cased in the JIT.
  4. It has to be special-cased in subprocess.py: cpython/Lib/subprocess.py at c419af9e277bea7dd78f4defefc752fe93b0b8ec · python/cpython · GitHub
  5. I don’t think it is used that often. It is hard to search for -b, but I did search for sys.flags.bytes_warning: Code search results · GitHub

My proposal is to deprecate -b CLI usage in 3.15 and remove it in 3.17.
I also propose not to touch sys.flags.bytes_warning and to always keep it as False.
sys.warnoptions will still contain the 'default::BytesWarning' string, so warnopts.remove("error::BytesWarning") calls will keep working.
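For context, here is a minimal sketch of how these two values can be inspected today. It assumes an interpreter where `-b` is still implemented. The helper `inspect_flags` and the `REPORT` snippet are hypothetical names used only for this illustration.

```python
import json
import subprocess
import sys

# Code run in the child: report the flag level and the active -W options.
REPORT = (
    "import json, sys; "
    "print(json.dumps([sys.flags.bytes_warning, sys.warnoptions]))"
)


def inspect_flags(*flags):
    """Return (sys.flags.bytes_warning, sys.warnoptions) as seen by a child."""
    completed = subprocess.run(
        [sys.executable, *flags, "-c", REPORT],
        capture_output=True,
        text=True,
        check=True,
    )
    level, warnoptions = json.loads(completed.stdout)
    return level, warnoptions


if __name__ == "__main__":
    for flags in ((), ("-b",), ("-bb",)):
        print(flags, inspect_flags(*flags))
```

On current releases the flag level is 0, 1, or 2 depending on how many times `-b` was given, and the matching `default::BytesWarning` or `error::BytesWarning` entry appears in sys.warnoptions. Code that reads these values is what the proposal tries not to break.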

We can also remove the C code that issues this warning in 3.17, together with the removal of -b.

What do others think? Please share your opinions.


For more background on the JIT special-casing:

We can currently evaluate comparisons between known values in the JIT when the values have known “safe” types. So any known int[1]/str/float/bool value compared with any other known int/str/float/bool value is safe to evaluate at JIT-compile-time.

Adding bytes to this list means we need to either check that the -b option isn’t active when compiling and branch on that, or rework the code to check for int/str on the other side of the comparison. It’s not a super heavy lift, but it adds additional subtlety, complexity, and overhead to code that is already quite subtle, complex, and performance-sensitive.
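A toy Python model of that decision might look like the sketch below. The real check is written in C inside the JIT optimizer, so `SAFE_TYPES` and `can_fold_comparison` are hypothetical names, and the sketch ignores the "compact int" condition entirely. It takes the first of the two options: consult the `-b` state when deciding whether to fold.

```python
# Toy model only: the real check lives in C inside CPython's JIT optimizer.
SAFE_TYPES = frozenset({int, str, float, bool})


def can_fold_comparison(left, right, bytes_warning):
    """Return True if `left == right` may be evaluated ahead of time."""
    operand_types = {type(left), type(right)}
    if operand_types <= SAFE_TYPES:
        return True
    if operand_types <= SAFE_TYPES | {bytes}:
        # At least one operand is bytes. With -b or -bb active, comparing
        # bytes with str or int must still warn (or raise) at run time, so
        # this sketch conservatively folds only bytes-vs-bytes in that case.
        return not bytes_warning or operand_types == {bytes}
    return False
```

A caller would pass `sys.flags.bytes_warning` as the last argument. If `-b` did nothing, that argument and the extra branch would disappear, which is the simplification being discussed.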


  1. …provided the int is “compact”. Betcha didn’t know that arbitrary code can run during operations on huge integers! :wink:


+1. Sounds like a good idea.


I agree that it’s time to deprecate this feature.

Correct, BytesWarning is mostly useful for converting a Python 2 code base using str/unicode to Python 3 with bytes/str. Once you have a Python 3 code base that uses bytes and str correctly, the warning is less useful (or just useless).


Please consider keeping the flag forever (doing nothing) instead of removing it entirely.


I am open to that idea. Can you please share your ideas of potential benefits?

There is a prior example: the removal of the -t command line option, commit. The option is still accepted but does nothing.


The obvious benefit is that if there is a script somewhere that calls Python with the -b flag, it won’t stop working. Maintaining a no-op flag has an almost zero cost, so I see no benefit in not doing that. Also, if the flag is kept, there is no way somebody will introduce a new flag with the same name in the future.
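As a rough illustration of what "accepted but does nothing" means, here is a sketch using argparse. CPython parses its own options in C, so this is only an analogy: the `toy-python` program name, the `parse_args` helper, and the deprecation message are all made up for the example.

```python
import argparse
import warnings


def parse_args(argv):
    """Parse a toy command line where -b is accepted but has no effect."""
    parser = argparse.ArgumentParser(prog="toy-python")
    # Keep the option so that old scripts passing -b or -bb do not break.
    parser.add_argument("-b", action="count", default=0,
                        help="deprecated; accepted but ignored")
    parser.add_argument("-c", dest="command", help="program passed as string")
    args = parser.parse_args(argv)
    if args.b:
        # The flag is still recognized, but nothing else depends on it.
        warnings.warn("-b is deprecated and has no effect",
                      DeprecationWarning, stacklevel=2)
    return args
```

Because the option stays registered, the name `-b` is also effectively reserved and cannot be silently reused for something else.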

Anyway, I guess the main idea is to follow “don’t break it if you really don’t need to”.
