I’ve been going through Stack Overflow trying to close and re-route a bunch of old duplicate questions… I discovered that when code has tabbed indentation followed by spaced indentation, this only gets reported as a TabError when the code tries to use 8 spaces to match the tab.
All examples below are taken from the Python 3.8 REPL; I also tried in 3.11 and it’s all the same, except that TabErrors don’t show a ^ in the message, and the IndentationError’s caret is one space further to the right. Notably, either way, that caret points to the end of the line, including the comment, which still doesn’t seem useful.
On to examples:
>>> def tabfirst_4():
... pass # tab
... pass # 4 spaces
File "<stdin>", line 3
pass # 4 spaces
^
IndentationError: unindent does not match any outer indentation level
>>>
>>> def tabfirst_8():
... pass # tab
... pass # 8 spaces
File "<stdin>", line 3
pass # 8 spaces
^
TabError: inconsistent use of tabs and spaces in indentation
Fair enough; my understanding from the documentation is that Python 3 still considers a tab to be “equivalent to” 8 spaces (actually, space up to the next multiple-of-8 tab stop) in some sense:
Tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight (this is intended to be the same rule as used by Unix). The total number of spaces preceding the first non-blank character then determines the line’s indentation. Indentation cannot be split over multiple physical lines using backslashes; the whitespace up to the first backslash determines the indentation.
Indentation is rejected as inconsistent if a source file mixes tabs and spaces in a way that makes the meaning dependent on the worth of a tab in spaces; a TabError is raised in that case.
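To make the quoted rule concrete, here is a minimal sketch of the tab-stop calculation, with the tab size exposed as a parameter. The names `indent_width` and `compare` are my own, purely for illustration, and this is not a claim about how the tokenizer is actually implemented. It shows how a line indented with one tab relates to a line indented with 8 or 4 spaces under different tab sizes.

```python
def indent_width(line: str, tabsize: int = 8) -> int:
    """Width of the leading whitespace, with each tab advancing to the next tab stop."""
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            # Jump to the next multiple of tabsize.
            col = (col // tabsize + 1) * tabsize
        else:
            break
    return col


def compare(prev: str, cur: str, tabsize: int) -> str:
    """How cur's indentation relates to prev's for a given tab size."""
    a, b = indent_width(prev, tabsize), indent_width(cur, tabsize)
    if a == b:
        return "same"
    return "deeper" if b > a else "shallower"


tab_line = "\tpass"
eight_spaces = " " * 8 + "pass"
four_spaces = " " * 4 + "pass"

for tabsize in (1, 4, 8):
    print(
        tabsize,
        compare(tab_line, eight_spaces, tabsize),
        compare(tab_line, four_spaces, tabsize),
    )
```

In both pairings the relationship between the two lines changes with the tab size, so by the wording of the second quoted paragraph both look like cases where the meaning depends on the worth of a tab.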
But here I have multiple objections.
- What is even the benefit of keeping around the first calculation (which is, as I recall, identical to how it was in 2.x)? The number 8 is not in any way special to the indentation system; even someone who chose to mix tabs and spaces “responsibly” would presumably use the same pattern of tabs and spaces, such that the “weight” of the tabs would be irrelevant.
- Arguably, it causes harm:
>>> def mixed_8():
... pass # a tab, followed by 8 spaces
... pass # 8 spaces, followed by a tab
...
>>> # no error!!!
This only works when the number of spaces is a multiple of 8, of course. It’s almost as if special treatment is being afforded to people who want an 8-space indent; they get to use tabs to represent those indents, and interchange them with 8-space blocks freely.[1]
- When spaces come first, accepting the above reasoning, one would expect a TabError for 8-space indent followed by tab-indent, but a base IndentationError: unexpected indent for 4-space indent followed by tab-indent. After all, in the latter case, a 4-space indent was followed by something equivalent to 8-space indent. Right? But that doesn’t happen:
>>> def spacefirst_4():
... pass # 4 spaces
... pass # tab
File "<stdin>", line 3
pass # tab
^
TabError: inconsistent use of tabs and spaces in indentation
>>> def spacefirst_8():
... pass # 8 spaces
... pass # tab
File "<stdin>", line 3
pass # tab
^
TabError: inconsistent use of tabs and spaces in indentation
Strangely, now the error is consistent: the mixed-indentation problem is detected first in both the 4-space and the 8-space case. So why the inconsistency between the two orderings? Why is mixed indentation detected first when spaces come first, but not when the tab comes first?
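For anyone who wants to reproduce this without typing tabs into a REPL, here is a rough sketch that feeds the same five function bodies to `compile()` and records which exception class, if any, comes back. The `SNIPPETS` dict and the `outcome` helper are just my own scaffolding; the whitespace in each string follows the comments in the sessions above.

```python
# Each snippet mirrors one of the REPL sessions; "\t" is a literal tab.
SNIPPETS = {
    "tabfirst_4": "def f():\n\tpass\n    pass\n",
    "tabfirst_8": "def f():\n\tpass\n        pass\n",
    "mixed_8": "def f():\n\t        pass\n        \tpass\n",
    "spacefirst_4": "def f():\n    pass\n\tpass\n",
    "spacefirst_8": "def f():\n        pass\n\tpass\n",
}


def outcome(source: str) -> str:
    """Name of the exception class raised while compiling, or 'no error'."""
    try:
        compile(source, "<snippet>", "exec")
    except IndentationError as exc:  # TabError is a subclass of IndentationError
        return type(exc).__name__
    return "no error"


results = {name: outcome(source) for name, source in SNIPPETS.items()}
for name, result in results.items():
    print(f"{name:13} {result}")
```

Only `tabfirst_4` should come back as a plain `IndentationError`; `mixed_8` should compile cleanly and the other three should report `TabError`, matching the sessions above.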
Obviously, disallowing tabs will break outstanding code, and there are presumably codebases out there that have interchanged tabs with 8-space blocks that would also break, and don’t want the maintenance burden of fixing that terrible indentation style.
But surely the mixed-indentation check could at least come first consistently? That would preempt a ton of questions from beginners who wrote the equivalent of tabfirst_4, and see aligned text in their editor.[2]