Possible bug in str.count

Teut2711 · January 5, 2025, 10:09am

Python 3.12.1
>>> s = "adada"
>>> s.count("ada")
1

Is this expected behavior?

BrenBarn · January 5, 2025, 10:12am

Yes, because the documentation says (emphasis added):

Return the number of non-overlapping occurrences of substring sub in the range [start, end].
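To make the documented behavior concrete, here is a rough sketch of a non-overlapping scan: after each match, the search resumes past the end of that match. The helper name `count_non_overlapping` is hypothetical, and this is only a model of the semantics, not CPython's actual implementation.

```python
def count_non_overlapping(s: str, sub: str) -> int:
    """Count matches left to right, resuming after the end of each match."""
    if not sub:
        # str.count returns len(s) + 1 for an empty substring
        return len(s) + 1
    count = 0
    pos = s.find(sub)
    while pos != -1:
        count += 1
        # Skip the whole match, so the next one cannot overlap it
        pos = s.find(sub, pos + len(sub))
    return count


print(count_non_overlapping("adada", "ada"))  # 1
print("adada".count("ada"))                   # 1
```

In "adada" the first match covers indices 0 to 2, so the scan resumes at index 3, where only "da" is left.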

ronaldoussoren · January 5, 2025, 10:12am

Yes, the result is the number of non-overlapping occurrences.

Teut2711 · January 5, 2025, 10:14am

Thanks, but why was it designed this way? By common logic, the expected answer would be 2. What’s the motivation behind this design? KMP would also give 2 in O(n + m). Is there another method that counts overlapping occurrences? This can cause serious bugs.

blhsing · January 5, 2025, 10:18am

The typical workaround is to use a lookahead regex search instead:

import re

assert len(re.findall('(?=ada)', 'adada')) == 2
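If a regex feels heavy, a plain `str.find` loop that advances by one character after each match should give the same overlapping count. This is a minimal sketch; `count_overlapping` is a hypothetical helper name, not something in the standard library.

```python
def count_overlapping(s: str, sub: str) -> int:
    """Count occurrences of sub in s, allowing matches to overlap."""
    if not sub:
        raise ValueError("sub must be non-empty")
    count = 0
    pos = s.find(sub)
    while pos != -1:
        count += 1
        # Advance by one character only, so overlapping matches are found
        pos = s.find(sub, pos + 1)
    return count


print(count_overlapping("adada", "ada"))  # 2
```

Unlike the regex version, this treats `sub` as a literal string, so there is no need for `re.escape` when it contains regex metacharacters.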

Teut2711 · January 5, 2025, 10:20am

Thanks for that, though my question is not about how we can do it but rather why it was designed this way.

barry-scott · January 5, 2025, 10:39am

That depends on the question you ask.
If the question is “how many occurrences can be removed?”
Then 2 is the wrong answer.
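A quick way to check the "removal" reading is to compare against `str.replace`, which also works on non-overlapping occurrences. This is just an illustration; the variable names are arbitrary.

```python
s = "adada"
sub = "ada"

# Remove every non-overlapping occurrence of sub
stripped = s.replace(sub, "")

# Each removed occurrence shortens the string by len(sub)
removed = (len(s) - len(stripped)) // len(sub)

print(stripped)      # da
print(removed)       # 1
print(s.count(sub))  # 1
```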

chepner · January 5, 2025, 4:53pm

Only if you disregard the documented behavior. If you expect overlapping matches to be counted, you need to look for a different function.

Teut2711 · January 7, 2025, 4:45am

Thanks, I never looked at it from that perspective.

franklinvp · January 7, 2025, 10:15am

One can remove simultaneously or sequentially. Both interpretations would still fit.
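For the string in this thread, the two readings can be compared with a small sketch. `remove_sequentially` is a hypothetical helper that removes one occurrence at a time and rescans; the simultaneous case is taken to be a single `str.replace` call.

```python
def remove_sequentially(s: str, sub: str) -> tuple[str, int]:
    """Remove the first occurrence of sub repeatedly until none remain."""
    if not sub:
        raise ValueError("sub must be non-empty")
    removed = 0
    while sub in s:
        s = s.replace(sub, "", 1)  # remove only the first occurrence
        removed += 1
    return s, removed


# Simultaneous removal: one pass over non-overlapping occurrences
simultaneous = "adada".replace("ada", "")

# Sequential removal: one occurrence at a time, rescanning after each
sequential, steps = remove_sequentially("adada", "ada")

print(simultaneous)       # da
print(sequential, steps)  # da 1
```

For "adada", both approaches remove exactly one occurrence and leave "da".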