String concatenation with a for loop

Hello everyone,

When I run this program with Python 3.12.6 on Debian trixie:

# Part 1
def test():
    s = 'a'*1_000_000
    t = ''
    for c in s:
        print(len(t), id(t))
        t += c

test()

##############################################
# Part 2

s = 'a'*1_000_000
t = ''
for c in s:
    print(len(t), id(t))
    t += c

Part 1 is much faster than Part 2. It seems that in Part 2, Python copies the string at each step.
What is the reason for this behavior?

Thanks!

My guess is that it's the cost of accessing local variables in Part 1 compared with the cost of accessing global variables in Part 2.

In both cases the string manipulation in t += c is the same.
Note that each time you execute t += c, Python actually does t = t + c, which creates a new string to replace t on every pass around the loop.
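
For instance (an illustrative aside; object ids are implementation details), holding a second reference forces a genuinely new string to be built:

t = 'abc'
keep = t                 # a second reference rules out in-place reuse
old_id = id(t)
t += 'd'
print(old_id == id(t))   # typically False: t now names a new object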

CPython actually has a specific optimization for string concatenation: it tries to reuse the original object and avoid copying. I suspect this doesn't always apply, but even if it only works sometimes it can make a huge difference.
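
A small illustration of that reuse (hedged: object ids and in-place resizing are CPython implementation details):

def local_concat():
    t = ''
    ids = set()
    for c in 'a' * 1000:
        t += c
        ids.add(id(t))
    # When CPython can resize the string in place, far fewer distinct
    # ids appear than the 1000 you would get from copying every time.
    print('distinct ids:', len(ids))

local_concat()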

This can only work if the reference count of the object is exactly 1. I suspect that, given how global variable accesses behave compared to function locals, this optimization is also prevented, in addition to what @barry-scott already mentioned about global variables being slower to access than function locals.
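
One way to see the access-cost difference (an illustrative aside, not from the original post) is to compare the bytecode with dis:

import dis

def f():
    t = ''
    for c in 'abc':
        t += c

dis.dis(f)  # function body: LOAD_FAST / STORE_FAST for t and c
# Module-level code goes through dict-based name lookups instead:
dis.dis(compile("t += c", "<module>", "exec"))  # LOAD_NAME / STORE_NAME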

Please don't just say "much faster"; say how much faster. And don't print inside the loop when timing.
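
For example, a hedged sketch that times both variants without printing in the loop (using time.perf_counter; the size is illustrative):

import time

n = 100_000

def part1():
    t = ''
    for c in 'a' * n:
        t += c

start = time.perf_counter()
part1()
print('local :', time.perf_counter() - start)

t = ''
start = time.perf_counter()
for c in 'a' * n:
    t += c
print('global:', time.perf_counter() - start)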

Explanation:


It is not related to strings, but rather to the speed of accessing a local variable versus a global variable.

import time


def bench():
    a = 0
    for i in range(1000000):
        a += i

    return a


start = time.time()
result = bench()

print(time.time() - start)
print(result)

start = time.time()
a = 0
for i in range(1000000):
    a += i

print(time.time() - start)
print(result)
Output:

0.06985592842102051
499999500000
0.1436767578125
499999500000

@elis.byberi Did you have a look at the GitHub issue with the explanation? And why did you measure ints instead of strings? That doesn’t show the huge difference.

0.03 seconds for Part 1 versus 15 seconds for Part 2.
The prints highlight the fact that the string t is copied at each step in Part 2, while this is not the case in Part 1.

The link given by Stephan answers my question, thanks!


String: global vs local

import time


def bench_local():
    s = 'a' * 100_000
    t = ''
    for c in s:
        t += c

    return len(t)


start = time.time()
result = bench_local()

print(time.time() - start)
print(result)

s = 'a' * 100_000
t = ''


def bench_global():
    global s
    global t
    for c in s:
        t += c

    return len(t)


start = time.time()
result = bench_global()

print(time.time() - start)
print(result)

Python 3.10.12:
0.009384393692016602
100000
0.36257052421569824
100000

Python 3.12.7:
0.009612798690795898
100000
0.3437345027923584
100000

Local: string vs integer

import time


def bench_string():
    s = 'a' * 1000000
    t = ''
    for c in s:
        t += c

    return len(t)


start = time.time()
result = bench_string()

print(time.time() - start)
print(result)


def bench_integer():
    a = 0
    for i in range(1000000):
        a += i

    return a


start = time.time()
result = bench_integer()

print(time.time() - start)
print(result)

Python 3.10.12:
0.08484053611755371
1000000
0.06541633605957031
499999500000

Python 3.12.7:
0.06548666954040527
1000000
0.05298948287963867
499999500000

Ah right, for that it’s good. Just not for timing.

That doesn’t seem to cause a significant slowdown. In Python 3.10, there are no copies, but the slowdown remains the same:

# Part 1
def test():
    s = 'a' * 1_000_000
    t = 'b' * 1_000_000
    for c in s[:10]:
        print(len(t), id(t))
        t += c


test()

##############################################
# Part 2

s = 'a' * 1_000_000
t = 'b' * 1_000_000
for c in s[:10]:
    print(len(t), id(t))
    t += c

Python 3.10.12:
1000000 126645204770832
1000001 126645204770832
1000002 126645204770832
1000003 126645204770832
1000004 126645204770832
1000005 126645204770832
1000006 126645204770832
1000007 126645204770832
1000008 126645204770832
1000009 126645204770832
1000000 98116411510800
1000001 98116411510800
1000002 98116411510800
1000003 98116411510800
1000004 98116411510800
1000005 98116411510800
1000006 98116411510800
1000007 98116411510800
1000008 98116411510800
1000009 98116411510800

Python 3.12.7:
1000000 129595043692560
1000001 129595042689040
1000002 129595042689040
1000003 129595042689040
1000004 129595042689040
1000005 129595042689040
1000006 129595042689040
1000007 129595042689040
1000008 129595042689040
1000009 129595042689040
1000000 37507376
1000001 38507440
1000002 37507376
1000003 38507440
1000004 37507376
1000005 38507440
1000006 37507376
1000007 38507440
1000008 37507376
1000009 38507440

@elis.byberi In your benchmark, you used a function with global. In your id logger, you didn’t.

I don’t quite understand what you mean. Could you elaborate?

This:

def bench_global():
    global s
    global t
    for c in s:
        t += c

Vs this:

s = 'a' * 1_000_000
t = 'b' * 1_000_000
for c in s[:10]:
    print(len(t), id(t))
    t += c

Try the latter with global t in a function like in the benchmark. What output do you get?
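
Concretely, a minimal sketch of that variant (the same id logger, but wrapped in a function that declares t global):

s = 'a' * 1_000_000
t = 'b' * 1_000_000

def log_ids():
    global t          # t is rebound in the loop, so it must be declared global
    for c in s[:10]:
        print(len(t), id(t))
        t += c

log_ids()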

It shows a similar output for both Python 3.10 and Python 3.12, but does not affect timing.

The reason is implementation-specific, and of primarily academic interest. Python provides no guarantee that something like

for c in s:
    print(len(t), id(t))
    t += c

won’t take quadratic time. It does guarantee that

parts = []
for c in s:
    parts.append(c)
t = ''.join(parts)

will take linear time, so avoid repeated concatenation.
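
A quick sketch contrasting the two patterns (hedged; the size and names are illustrative):

import time

s = 'a' * 1_000_000

def concat(s):
    t = ''
    for c in s:       # potentially quadratic, depending on the implementation
        t += c
    return t

def join(s):
    return ''.join(s)   # guaranteed linear

for f in (concat, join):
    start = time.perf_counter()
    f(s)
    print(f.__name__, time.perf_counter() - start)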

Locals versus nonlocals:

import time
s = 'a'*1_000_000

def testo():
    t = ''
    def testi():
        start = time.time()
        nonlocal t
        for c in s:
            t += c
        print(time.time() - start)
    testi()

testo()

I think :melting_face: