Claude Code – how much hype, how much true wizardry?

A simple example to show that they’re not just aping human language. Here’s a simple regexp:

haystack = re.compile(r"\d+\s+")

Matches at least one digit followed by at least one whitespace character.

Apply it to strings of the form:

needle = "123" * N

It can’t succeed because there’s no whitespace in the needle.

.match() fails “almost instantly”, in time linear in N.

But .search() takes time quadratic in N to fail. Essentially because it starts all over again at every starting index.

There’s nothing “pathological” about the regexp - it’s very simple and well-behaved. It’s the higher level strategy search uses that drives it. Note that N has to get will into the thousands before this is noticeable. In “catastrophic backtracking” cases - which take exponential time to fail - massive slowdown becomes apparent with N in the dozens.

My first thought was “duh - use a possessive quantifier - \d++”. Which does help. Cuts search time-to-fail about in half, but it’s still quadratic time.

There’s very little discussion of this kind of thing I’ve seen anywhere. Is it possible to change the regexp so it’s always worst-case linear time?

ChatGPT answered at once: sure! Just start it with a caret (“^”), effectively turning search() into match(). But that’s incorrect. It should match needles like "5-345 ". which require starting at index 2.

If did eventually solve it, but I’m pretty sure it didn’t find any “canned solution” in its training data. It instead showed every sign of “thinking”, reasoning about how the implementation worked, trying various dead ends until it hit on one that worked (all in less than a minute).

Me? I didn’t solve it, although I didn’t try very hard. @stefan2 solved it by adding 7 characters to the start of the regexp. ChatGPT also found that one - but then cut it to adding just two new characters at the start. It gave every appearance of “knowing” what it was doing, far deeper than just constructing grammatical sentences.

Its parting comment:

Just so.

2 Likes