Report a possible problem with string methods

The following code gives the same output for the arguments 1 and 2, which looks like a problem. Try the code below and see.

txt = "H\te\tl\tl\to"
print("  :"+txt)
txt = "H\te\tl\tl\to"
print("00:"+txt.expandtabs(0))
txt = "H\te\tl\tl\to"
print("01:"+txt.expandtabs(1))
txt = "H\te\tl\tl\to"
print("02:"+txt.expandtabs(2))
txt = "H\te\tl\tl\to"
print("03:"+txt.expandtabs(3))
txt = "H\te\tl\tl\to"
print("04:"+txt.expandtabs(4))
txt = "H\te\tl\tl\to"
print("05:"+txt.expandtabs(5))
txt = "H\te\tl\tl\to"
print("06:"+txt.expandtabs(6))
txt = "H\te\tl\tl\to"
print("07:"+txt.expandtabs(7))
txt = "H\te\tl\tl\to"
print("08:"+txt.expandtabs(8))
txt = "H\te\tl\tl\to"
print("09:"+txt.expandtabs(9))
txt = "H\te\tl\tl\to"
print("10:"+txt.expandtabs(10))

The result from your code is exactly as expected.

The argument to expandtabs is not the number of spaces that replaces each tab character, because “expanding tabs” does not mean replacing every tab with a fixed number of spaces.

Historically, the idea of a “tab” goes back to the typewriter. There were physical “tab stop” devices attached to the carriage, so that when you pressed the tab key, the carriage would mechanically move forward, until the stop… stopped it. The idea is to make it easy to type text that appears in columns.

txt.expandtabs(n), for n > 0, means that each tab is turned into one or more spaces, such that the text advances to the next position that is a multiple of n. In other words, it emulates having tab stops set every n columns.
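The tab-stop rule can be sketched as a small re-implementation. This is only an illustration, not CPython's actual code: the name expand_tabs_sketch is made up for this post, and the handling of newlines and of a tab size of zero follows the documented behaviour of str.expandtabs as I understand it.

```python
def expand_tabs_sketch(text, tabsize=8):
    """Hypothetical re-implementation of str.expandtabs, for illustration only."""
    pieces = []
    column = 0  # current column on the current line, counted from 0
    for ch in text:
        if ch == "\t":
            if tabsize > 0:
                # Advance to the next multiple of tabsize: always 1..tabsize spaces.
                spaces = tabsize - column % tabsize
                pieces.append(" " * spaces)
                column += spaces
            # For tabsize <= 0 the tab is simply dropped.
        elif ch in "\n\r":
            pieces.append(ch)
            column = 0  # a new line starts again at column 0
        else:
            pieces.append(ch)
            column += 1
    return "".join(pieces)


txt = "H\te\tl\tl\to"
for n in range(0, 11):
    # The sketch should agree with the built-in method for every tab size tried.
    assert expand_tabs_sketch(txt, n) == txt.expandtabs(n)

print(repr(expand_tabs_sketch(txt, 1)))  # 'H e l l o'
print(repr(expand_tabs_sketch(txt, 2)))  # 'H e l l o'
```

With one character before every tab, tab sizes 1 and 2 both need exactly one space to reach the next stop, which is why the two outputs match.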

Every positive integer is a multiple of 1, so txt.expandtabs(1) turns every tab into a single space.

txt.expandtabs(2) turns tabs into either 1 or 2 spaces, as needed, so that the text after it starts at an even-numbered position. If every piece in between the tabs is a single character, then they will get single spaces in between, because a single character plus a single space makes two characters, getting us to the next tab stop boundary.

You can see the differences if you use different amounts of text (including none at all) between some of the tab characters.
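As a quick illustration, here are a few made-up sample strings (not from the original post) where the text between tabs is zero, one, or two characters long. The tab sizes 1 and 2 only agree for the first one.

```python
# Hypothetical sample strings with different amounts of text around the tabs.
samples = ["H\te", "He\tl", "H\t\te", "\tH"]

# Map each sample to its expansion with tab size 1 and tab size 2.
results = {s: (s.expandtabs(1), s.expandtabs(2)) for s in samples}

for sample, (one, two) in results.items():
    print(f"{sample!r:10} 1 -> {one!r:8} 2 -> {two!r}")
```

Only "H\te" looks the same in both columns. As soon as a tab follows an even number of characters (including none), tab size 2 inserts two spaces where tab size 1 inserts one.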

Why does an argument of 3 produce one more space than 2, and 4 one more space than 3, while 1 and 2 both produce a single space? If you pass 1 or 2 to the code above, you get the same output, which does not seem logically correct. This is a bit confusing.

The number of spaces inserted by expandtabs(n) is essentially

n - number_of_characters_before_the_tab

In your examples, you always have one character before each tab, so n - 1 spaces are inserted. In the case of expandtabs(1), that would be 0 spaces. A full tabstop’s worth of spaces (i.e., 1) is inserted instead of 0.

Keep in mind that the real formula is more involved as it needs to handle runs of characters that are longer than one or more tabstops.
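One way to write the fuller rule for a single tab is to take the character count modulo the tab size first. The helper name spaces_for_tab below is hypothetical, and the sketch assumes a single line with the count starting at the beginning of that line.

```python
def spaces_for_tab(chars_before, tabsize):
    """Hypothetical helper: spaces produced by a tab that follows
    `chars_before` characters counted from the start of the line."""
    if tabsize <= 0:
        return 0  # tabs are removed entirely
    return tabsize - chars_before % tabsize


# Compare the formula with what str.expandtabs actually does for
# prefixes both shorter and longer than one tab stop.
for length in range(0, 10):
    prefix = "x" * length
    expanded = (prefix + "\t").expandtabs(4)
    actual = len(expanded) - length
    assert actual == spaces_for_tab(length, 4)
    print(length, actual)
```

The modulo is what handles prefixes longer than a tab stop: 5 characters with a tab size of 4 behave like 1 character, and a prefix that ends exactly on a tab stop gets a full tab stop's worth of spaces.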

The same way as when you use the same argument for expandtabs, and change the amount of text in the string before the tab:

>>> 'a\tz'.expandtabs(4)
'a   z'
>>> 'ab\tz'.expandtabs(4)
'ab  z'
>>> 'abc\tz'.expandtabs(4)
'abc z'
>>> 'abcd\tz'.expandtabs(4)
'abcd    z'
>>> 'abcde\tz'.expandtabs(4)
'abcde   z'
>>> 'abcdef\tz'.expandtabs(4)
'abcdef  z'
>>> 'abcdefg\tz'.expandtabs(4)
'abcdefg z'
>>> 'abcdefgh\tz'.expandtabs(4)
'abcdefgh    z'
>>> 'a\tbc\tdef\tghij\tz'.expandtabs(4)
'a   bc  def ghij    z'

See the pattern? The tab has to become at least one space, but it can become more than one, so that the next character is aligned at a multiple of four. This makes it look like the text is in columns that are 4 characters wide, but makes sure that the column texts don’t run into each other.

If you want the rule “every tab should be replaced with a specific number of spaces” instead, then use the .replace method.
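A small side-by-side comparison of the two rules, using a made-up sample string:

```python
text = "a\tbc\tdef"  # hypothetical sample string

# Fixed rule: every tab becomes exactly four spaces, wherever it occurs.
fixed = text.replace("\t", " " * 4)

# Tab-stop rule: every tab pads up to the next multiple of four.
aligned = text.expandtabs(4)

print(repr(fixed))    # 'a    bc    def'
print(repr(aligned))  # 'a   bc  def'
```

With .replace the gaps are all the same width and the columns drift. With expandtabs the gaps differ and the columns line up.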

You can think of it as pressing the Tab key to advance to the next column, with the argument being the width of the columns.

A tab always advances the position to the next column, even if it’s already at the start of a column.

This example might make it clearer:

>>> t = 'ab\tc\tde'
>>> t.expandtabs(5)
'ab   c    de'
#^    ^    ^  Every 5 characters
>>> t.expandtabs(4)
'ab  c   de'
#^   ^   ^  Every 4 characters
>>> t.expandtabs(3)
'ab c  de'
#^  ^  ^  Every 3 characters
>>> t.expandtabs(2)
'ab  c de'
#^ ^ ^ ^  Every 2 characters
>>> t.expandtabs(1)
'ab c de'
#^^^^^^^  Every 1 character

I can’t find a good explanation for this.
See the picture below, I don’t know the logic.
I would like an explanation for the values 0, 1 to 4, 4 to 11, and 12 to 28

You have gotten multiple explanations in this thread by now. Could you be a bit more specific where you think the explanations are inaccurate?

I wanted a brief description: why entry zero has no space character, why entries 1 to 4 have a space character, why entries 1 to 11 increase in order, and why from entry 12 to the end it suddenly jumps back and then increases by one in order.
I want to know the logic behind this.
What is the formula for arranging spaces?

And the other answers provide exactly that, no? I don’t really see a point in repeating it a third time if you are just going to ignore it.

Others have already explained the general mechanism. In your specific example, excluding the special case of tabsize = 0 (which simply replaces each tab character with an empty string), the number of spaces added for each tab character is tabsize - 11 % tabsize, where tabsize is the value of the argument to expandtabs.

>>> for tabsize in range(1, 30):
...     print(f"{tabsize} >> {tabsize - 11 % tabsize}")
... 
1 >> 1
2 >> 1
3 >> 1
4 >> 1
5 >> 4
6 >> 1
7 >> 3
8 >> 5
9 >> 7
10 >> 9
11 >> 11
12 >> 1
13 >> 2
14 >> 3
15 >> 4
16 >> 5
17 >> 6
18 >> 7
19 >> 8
20 >> 9
21 >> 10
22 >> 11
23 >> 12
24 >> 13
25 >> 14
26 >> 15
27 >> 16
28 >> 17
29 >> 18

Explanation: your string can be analysed in sections: each of the first 4 sections starts from a tab stop, then has 11 non-tab characters, then a tab character. On expanding tabs, each tab character converts to the minimum positive number n of spaces so that 11 + n is a multiple of tabsize, and that number n can be calculated using the formula tabsize - 11 % tabsize.
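The formula can be checked against str.expandtabs directly. Your exact string is only in the picture, so the check below uses a stand-in built from the description above: four sections of 11 non-tab characters, each followed by a tab, plus some trailing text. The names section and text are placeholders for this sketch.

```python
# Stand-in for the string in the picture (an assumption, not the original):
# four sections of 11 non-tab characters, each followed by a tab.
section = "x" * 11 + "\t"
text = section * 4 + "end"

for tabsize in range(1, 30):
    expanded = text.expandtabs(tabsize)
    per_tab = tabsize - 11 % tabsize
    # The stand-in contains no spaces of its own, so every space in the
    # result came from one of the four tabs.
    assert expanded.count(" ") == 4 * per_tab

# tabsize = 0 is the special case: the tabs simply disappear.
assert text.expandtabs(0) == "x" * 44 + "end"
```

Because each tab pads its section up to a tab stop, the next section starts on a tab stop again, so all four tabs expand to the same number of spaces.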

The best answer