Report a possible problem with string methods

abdolrahmanshokri · December 30, 2023, 10:55am

The following code gives the same output for the values 1 and 2, which looks like a problem. Try the code below and see.

txt = "H\te\tl\tl\to"
print(" :"+txt)
txt = "H\te\tl\tl\to"
print("00:"+txt.expandtabs(0))
txt = "H\te\tl\tl\to"
print("01:"+txt.expandtabs(1))
txt = "H\te\tl\tl\to"
print("02:"+txt.expandtabs(2))
txt = "H\te\tl\tl\to"
print("03:"+txt.expandtabs(3))
txt = "H\te\tl\tl\to"
print("04:"+txt.expandtabs(4))
txt = "H\te\tl\tl\to"
print("05:"+txt.expandtabs(5))
txt = "H\te\tl\tl\to"
print("06:"+txt.expandtabs(6))
txt = "H\te\tl\tl\to"
print("07:"+txt.expandtabs(7))
txt = "H\te\tl\tl\to"
print("08:"+txt.expandtabs(8))
txt = "H\te\tl\tl\to"
print("09:"+txt.expandtabs(9))
txt = "H\te\tl\tl\to"
print("10:"+txt.expandtabs(10))

kknechtel · December 30, 2023, 11:05am

The result from your code is exactly as expected.

The argument to expandtabs is not the number of spaces that replaces each tab character, because “expanding tabs” does not mean replacing every tab with a fixed number of spaces.

Historically, the idea of a “tab” goes back to the typewriter. There were physical “tab stop” devices attached to the carriage, so that when you pressed the tab key, the carriage would mechanically move forward, until the stop… stopped it. The idea is to make it easy to type text that appears in columns.

txt.expandtabs(n), for n > 0, means that the tabs are turned into a number of spaces, 1 or more, such that the text advances to the next multiple of n total length. In other words, it emulates having tab stops set every n spaces.

Every positive integer is a multiple of 1, so txt.expandtabs(1) turns every tab into a single space.

txt.expandtabs(2) turns tabs into either 1 or 2 spaces, as needed, so that the text after it starts at an even-numbered position. If every piece in between the tabs is a single character, then they will get single spaces in between, because a single character plus a single space makes two characters, getting us to the next tab stop boundary.

You can see the differences if you use different amounts of text (including none at all) between some of the tab characters.
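As a rough illustration of the rule described above, here is a minimal sketch that emulates expandtabs by tracking the current column. The function name expand_tabs is made up for this example; it is not how CPython implements the method, only a model that appears to give the same results for the inputs tried here.

```python
def expand_tabs(text: str, tabsize: int = 8) -> str:
    """Hypothetical re-implementation of str.expandtabs, for illustration."""
    out = []
    column = 0  # current position within the line
    for ch in text:
        if ch == "\t":
            if tabsize > 0:
                # Pad to the next multiple of tabsize (always 1..tabsize spaces).
                pad = tabsize - column % tabsize
                out.append(" " * pad)
                column += pad
            # With tabsize <= 0 the tab is simply dropped.
        elif ch in "\r\n":
            out.append(ch)
            column = 0  # a new line restarts the column count
        else:
            out.append(ch)
            column += 1
    return "".join(out)


txt = "H\te\tl\tl\to"
print(repr(expand_tabs(txt, 1)))  # 'H e l l o'
print(repr(expand_tabs(txt, 2)))  # 'H e l l o'
print(repr(expand_tabs(txt, 3)))  # 'H  e  l  l  o'
```

With one character between tabs, tab stops every 1 column and every 2 columns both leave exactly one space to fill, which is why the two outputs match.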

abdolrahmanshokri · December 30, 2023, 11:24am

Why does input 3 give one more space than input 2, and input 4 one more than input 3, while inputs 1 and 2 both produce a single space? If you pass 1 or 2 as the argument in the code above, you get the same output, which does not seem logically correct. This is a bit confusing.

dirn · December 30, 2023, 1:05pm

The number of spaces inserted by expandtabs(n) is essentially

n - number_of_characters_before_the_tab

In your examples, you always have one character before each tab, so n - 1 spaces are inserted. In the case of expandtabs(1), that would be 0 spaces. A full tabstop’s worth of spaces (i.e., 1) is inserted instead of 0.

Keep in mind that the real formula is more involved as it needs to handle runs of characters that are longer than one or more tabstops.
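To make the difference concrete, the sketch below compares the simplified formula with a more general one based on the column modulo the tab size. Both helper names are invented for this example, and the general formula is the usual "distance to the next tab stop" calculation rather than something quoted from the Python source.

```python
def naive_spaces(run_length: int, tabsize: int) -> int:
    # Simplified rule: only meaningful while 0 < run_length < tabsize.
    return tabsize - run_length


def general_spaces(column: int, tabsize: int) -> int:
    # Distance from the current column to the next multiple of tabsize.
    return tabsize - column % tabsize


# Compare both rules with what expandtabs(4) actually inserts.
for run_length in range(0, 10):
    text = "x" * run_length + "\t|"
    actual = text.expandtabs(4).count(" ")
    assert actual == general_spaces(run_length, 4)
    print(run_length, naive_spaces(run_length, 4), actual)
```

The two rules agree for runs shorter than the tab size and diverge once the run reaches or passes a tab stop, where the simplified rule would give zero or a negative count.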

kknechtel · December 30, 2023, 2:38pm

The same way as when you use the same argument for expandtabs, and change the amount of text in the string before the tab:

>>> 'a\tz'.expandtabs(4)
'a   z'
>>> 'ab\tz'.expandtabs(4)
'ab  z'
>>> 'abc\tz'.expandtabs(4)
'abc z'
>>> 'abcd\tz'.expandtabs(4)
'abcd    z'
>>> 'abcde\tz'.expandtabs(4)
'abcde   z'
>>> 'abcdef\tz'.expandtabs(4)
'abcdef  z'
>>> 'abcdefg\tz'.expandtabs(4)
'abcdefg z'
>>> 'abcdefgh\tz'.expandtabs(4)
'abcdefgh    z'
>>> 'a\tbc\tdef\tghij\tz'.expandtabs(4)
'a   bc  def ghij    z'

See the pattern? The tab has to become at least one space, but it can become a different amount of spaces, so that the next character is aligned at a multiple of four. This makes it look like the text is in columns that are 4 characters wide, but makes sure that the column texts don’t run into each other.

If you want the rule “every tab should be replaced with a specific number of spaces” instead, then use the .replace method.
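For comparison, a short example of the two behaviours side by side. The sample string and variable names are chosen only for illustration.

```python
text = "a\tbc\tdef\tz"

# Fixed rule: every tab becomes exactly four spaces.
fixed = text.replace("\t", " " * 4)

# Tab-stop rule: each tab pads to the next multiple of four.
aligned = text.expandtabs(4)

print(repr(fixed))    # 'a    bc    def    z'
print(repr(aligned))  # 'a   bc  def z'
```

With replace, the gaps are always the same width but the columns drift; with expandtabs, the gaps vary so that the columns line up.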

MRAB · December 31, 2023, 2:22am

You can think of it as pressing the Tab key to advance to the next column, with the argument being the width of the columns.

A tab always advances the position to the next column, even if it’s already at the start of a column.

This example might make it clearer:

>>> t = 'ab\tc\tde'
>>> t.expandtabs(5)
'ab   c    de'
#^    ^    ^  Every 5 characters
>>> t.expandtabs(4)
'ab  c   de'
#^   ^   ^  Every 4 characters
>>> t.expandtabs(3)
'ab c  de'
#^  ^  ^  Every 3 characters
>>> t.expandtabs(2)
'ab  c de'
#^ ^ ^ ^  Every 2 characters
>>> t.expandtabs(1)
'ab c de'
#^^^^^^^  Every 1 character

abdolrahmanshokri · December 31, 2023, 7:53am

I can’t find a good explanation for this.
See the picture below, I don’t know the logic.
I would like an explanation for the values 0, 1 to 4, 4 to 11, and 12 to 28

MegaIng · December 31, 2023, 8:48am

You have gotten multiple explanations in this thread by now. Could you be a bit more specific where you think the explanations are inaccurate?

abdolrahmanshokri · December 31, 2023, 11:30am

I wanted a brief description: why entry 0 has no space character, why entries 1 to 4 have a space character, why the count for entries 1 to 11 increases in order, and why from entry 12 to the end it suddenly drops back and then increases by one each time.
I want to know the logic behind this.
What is the formula for arranging the spaces?

MegaIng · December 31, 2023, 11:39am

And the other answers provide exactly that, no? I don’t really see a point in repeating it a third time if you are just going to ignore it.

mdickinson · January 1, 2024, 1:02pm

Others have already explained the general mechanism. In your specific example, excluding the special case of tabsize = 0 (which simply replaces each tab character with an empty string), the number of spaces added for each tab character is tabsize - 11 % tabsize, where tabsize is the value of the argument to expandtabs.

>>> for tabsize in range(1, 30):
...     print(f"{tabsize} >> {tabsize - 11 % tabsize}")
... 
1 >> 1
2 >> 1
3 >> 1
4 >> 1
5 >> 4
6 >> 1
7 >> 3
8 >> 5
9 >> 7
10 >> 9
11 >> 11
12 >> 1
13 >> 2
14 >> 3
15 >> 4
16 >> 5
17 >> 6
18 >> 7
19 >> 8
20 >> 9
21 >> 10
22 >> 11
23 >> 12
24 >> 13
25 >> 14
26 >> 15
27 >> 16
28 >> 17
29 >> 18

Explanation: your string can be analysed in sections: each of the first 4 sections starts from a tab stop, then has 11 non-tab characters, then a tab character. On expanding tabs, each tab character converts to the minimum positive number n of spaces so that 11 + n is a multiple of tabsize, and that number n can be calculated using the formula tabsize - 11 % tabsize.
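The formula can be checked against expandtabs directly. The original string from the picture is not reproduced in this thread, so the sketch below uses a stand-in built from four runs of 11 non-tab characters, each followed by a tab; the helper name gap_widths is likewise only for this example.

```python
import re

# Hypothetical stand-in for the original string: four sections of
# 11 non-tab characters, each followed by a tab, then some trailing text.
SECTION = "x" * 11
TEXT = (SECTION + "\t") * 4 + "end"


def gap_widths(text: str, tabsize: int) -> list[int]:
    """Return the length of each run of spaces after expanding tabs."""
    expanded = text.expandtabs(tabsize)
    return [len(run) for run in re.findall(r" +", expanded)]


for tabsize in range(1, 30):
    expected = tabsize - 11 % tabsize
    # Every section starts at a tab stop, so all four gaps are equal.
    assert gap_widths(TEXT, tabsize) == [expected] * 4

# tabsize = 0 is the special case: tabs vanish, so there are no gaps.
assert gap_widths(TEXT, 0) == []
```

Because each expanded section ends exactly on a tab stop, the next section starts at column 0 relative to the stops, which is why the same count applies to every tab.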

abdolrahmanshokri · January 3, 2024, 5:35am

The best answer
