Doctests failing with Python-3.15.0b1

Rosuav · May 15, 2026, 5:57am

That’s because JSON disallows trailing commas, which is one of its enduring frustrations and reasons for there being so many extensions to JSON.

tim.one · May 15, 2026, 6:13am

What do trailing commas buy you in pretty-printed output? I understand why they’re of value in source code, but don’t get the attraction for readability. To the contrary, seeing a trailing comma in output makes me think “oops! looks like some value(s) got lost”.

Or, if for reasons that escape me they do aid readability, why are they produced only for multiline output blocks? Is it really the case that

[1, 2, 3]

is “more readable” than

[1, 2, 3,]

but

[
    1,
    2,
    3
]

is less readable than

[
    1,
    2,
    3,
]

Have to say it looks arbitrary and inconsistent to me, toward no apparent good end.

hugovk · May 15, 2026, 6:37am

Yes, it improves readability of diffs.
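A minimal sketch of that point, not tied to pprint itself: it assumes a hypothetical `render` helper that lays a list out one item per line, with or without a trailing comma, and then counts the lines `difflib.unified_diff` reports as changed when one item is appended.

```python
import difflib


def render(items, trailing_comma):
    """Lay out a list one item per line (hypothetical helper, not pprint)."""
    lines = ["["]
    for position, item in enumerate(items):
        is_last = position == len(items) - 1
        comma = "" if (is_last and not trailing_comma) else ","
        lines.append(f"    {item!r}{comma}")
    lines.append("]")
    return lines


def changed_lines(before, after):
    """Return the added/removed lines of a unified diff, without headers."""
    diff = difflib.unified_diff(before, after, lineterm="")
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for trailing_comma in (False, True):
    before = render([1, 2, 3], trailing_comma)
    after = render([1, 2, 3, 4], trailing_comma)
    print(trailing_comma, changed_lines(before, after))
```

Without the trailing comma, appending an item also rewrites the previous last line; with it, only the new line shows up in the diff.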

JamesParrott · May 15, 2026, 10:48am

This is probably naive and wildly optimistic, and has no reasonable place in a discussion about the stdlib.

But is it possible to make the doctest test runner pprint-aware?

Rosuav · May 15, 2026, 12:36pm

Consistency, which is a minor benefit and not one that I would argue strongly for, but it can be of some value.

Ah, that one I can definitely answer. The consistency argument only applies to a column of values. When they’re all strung out in a single line, there’s no value in the last comma.

People keep mentioning diffs, and I agree. It’s worth noting that this is NOT just a benefit for editing; I have a number of apps that save their configs in JSON, and I can pretty-print those and then git track them, gaining all the benefits of tidy diffs without hand-editing anything. (I do usually need to post-process with a simple pretty-printer, and sometimes more ^[1], but after that, it’s a simple thing to commit that.)

[1] OBS, I’m looking at you
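The pretty-print-then-commit workflow described above could look roughly like this. It is only a sketch: `normalize_json` is a hypothetical helper name, the sample keys are made up, and `sort_keys=True` is an assumption added here to keep key order from producing noisy diffs.

```python
import json


def normalize_json(text):
    """Re-serialize JSON text in a stable, one-value-per-line layout."""
    data = json.loads(text)
    # indent=4 puts each value on its own line; sort_keys=True is an
    # assumption of this sketch, so that key order cannot churn the diff.
    return json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False) + "\n"


# Hypothetical config as an application might save it, all on one line.
compact = '{"scene": "main", "sources": ["camera", "mic"]}'
print(normalize_json(compact), end="")
```

Running the helper on its own output gives the same text back, so it is safe to apply before every commit.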

ncoghlan · May 15, 2026, 5:17pm

A fair point. I also realised that for the doctest case, the relevant setup override is using some variation of pprint = partial(pprint, ...) to restore the old default behaviour, as long as the options exist to turn off any new behaviour.
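A rough sketch of that `partial` override, using only long-standing pprint parameters. The names `legacy_pformat` and `legacy_pprint` are hypothetical, and whichever option ends up controlling the new layout is not shown because it is not named here; it would have to be pinned the same way.

```python
import pprint
from functools import partial

# Pin every formatting argument explicitly instead of relying on defaults.
# These are long-standing pprint parameters; any option a newer release
# adds for the new layout would need to be pinned here as well.
legacy_pformat = partial(
    pprint.pformat, indent=1, width=80, compact=False, sort_dicts=True
)
legacy_pprint = partial(
    pprint.pprint, indent=1, width=80, compact=False, sort_dicts=True
)

# In a doctest setup step, the pinned callable can shadow the name the
# examples use, for instance by placing it in the test globals.
print(legacy_pformat({"b": 1, "a": 2}))
```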

Alternatively, revisiting the “new name” idea, perhaps the new behaviour could be published as from pprint import print_code (or print_source), emphasising that the default display behaviour aligns with the general conventions for formatting Python source code.

Edit: as far as actually using doctest goes, I’ve relied heavily on builtin repr outputs staying consistent, as well as json.dumps output, but it never even occurred to me to base tested output on pprint. Those were thus the alternatives I had in mind when suggesting folks might be able to decouple their doctests from the details of pprint formatting.

tim.one · May 15, 2026, 7:07pm

Not for old releases, and it grates against doctest’s WYSIWYG soul regardless. doctest has always supported per-test options (in structured comments on the current test case) to overlook “simple” differences when desired, like

“# doctest: +NORMALIZE_WHITESPACE”
to collapse all stretches of whitespace in the expected output to single blanks
“# doctest: +ELLIPSIS”
to treat “...” in expected output as matching any string of 0 or more characters. For example, to ignore memory addresses sometimes embedded in Python str/repr representations. Like

>>> object()
<object object at 0x000002396D0E0DB0>

Rewrite as

>>> object() # doctest: +ELLIPSIS
<object object at ...>

and it works fine.
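For anyone who wants to see the difference without a test file, here is a small sketch that runs both variants through doctest's own parser and runner. The `run_snippet` helper is a hypothetical name; the doctest classes and arguments are the standard ones.

```python
import doctest

PLAIN = """
>>> object()
<object object at ...>
"""

WITH_DIRECTIVE = """
>>> object()  # doctest: +ELLIPSIS
<object object at ...>
"""


def run_snippet(text):
    """Run one doctest snippet silently and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        text, globs={}, name="snippet", filename=None, lineno=0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=lambda message: None)  # swallow the report
    return results.failed, results.attempted


print(run_snippet(PLAIN))
print(run_snippet(WITH_DIRECTIVE))
```

Only the example carrying the directive passes; the plain one fails because the address never matches a literal “...”.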

But very intentionally not global settings. They have to be explicitly asked for on each test that needs them. WYSIWYG, apart from locally and explicitly requested exceptions on a test-by-test basis.

But the variations they allow for are simple-minded, and no attempt is made to do anything fancier than character-level straightforward transformations. “Allow an optional comma at the end of a container object’s display” is far beyond its scope.

If push came to shove, I’d be sorely tempted to check in a copy of the current pprint inside doctest, and monkey-patch the pprint module to use that instead for the duration of a doctest run.

But then doctest couldn’t be used to vet new behaviors.

A solution to all is simple: if people want new behavior, fine, ask for it, via using a new function name. “In the face of ambiguity, refuse the temptation to guess.”

tim.one · May 15, 2026, 7:40pm

Thanks to all for elaborating on “the reason” for trailing commas on multiline displays of container objects! It never occurred to me that people would feed the output of pprint() to diff tools. But that’s perfectly reasonable, and should be supported, in some way. The new behaviors serve that end, but are not a big enough win (IMO) to justify breaking other uses. Use a new function name if that’s what you want.

This happened before. pprint() output was intended to be more stable than builtin str/repr for dict output. Before “ordered dicts” were introduced, nothing was defined about the iteration order of dicts.

pprint() addressed that by sorting (when possible) a dict’s keys, to force a specific iteration order for display.

But when ordered dicts were introduced, the iteration order did become defined, and for some apps it was important to preserve insertion order in displays.

But pprint() didn’t change. Backward compatibility ruled. Instead, a new optional sort_dicts=True argument was added, and a new function (pp()) was added whose default is sort_dicts=False.
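A short sketch of that split as it behaves on current releases. The `settings` dict is made up for illustration, and a short one-line dict is used so the output does not depend on any multiline layout rules.

```python
import io
import pprint

settings = {"zebra": 1, "apple": 2}  # insertion order: zebra, then apple

# pprint()/pformat() keep the historical default: keys are sorted.
sorted_text = pprint.pformat(settings)

# The same functions preserve insertion order only when asked to.
ordered_text = pprint.pformat(settings, sort_dicts=False)

# pp() is the newer entry point whose default is sort_dicts=False.
buffer = io.StringIO()
pprint.pp(settings, stream=buffer)
pp_text = buffer.getvalue().rstrip("\n")

print(sorted_text)
print(ordered_text)
print(pp_text)
```

Old callers of pprint() see no change, while anyone who wants insertion order asks for it, either with the argument or by calling pp().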

That’s the best way to do it. Unless they’re fixing bona fide bugs, new behaviors should be explicitly opt-in.

As above, in the case of dicts, pprint() was intended to be more consistent than str/repr output. It serves different purposes for different people, and people tend to dismiss the importance of use cases they don’t personally have.

I’m prone to that too, of course, but I don’t want to force “one size fits all” defaults on anyone, at least not in the core distro. Breaking 30 years of what was reliable behavior for display of the ultra-high-use container objects (list, dict, tuple) requires, IMO, far stronger justification than has been presented. Nothing against the new behaviors, and I wholly support making them possible. But not by changing the default behaviors.

Rosuav · May 15, 2026, 11:20pm

Personally, I would be completely fine with a split of “pprint is stable, pp may change from version to version”. Then use of pp() in a doctest would always be a bug, and it would be allowed to change its defaults as needed; OTOH use of pprint() in a doctest should be stable, with the possibility to change things by adding new options (which would always default to compatibility).

tim.one · May 15, 2026, 11:52pm

I almost would be fine with that too. The nit is that pp() was added 7 years ago (Python 3.8) with no such caution. No idea how much use it’s seen (except that I’ve never used it), but the safest course (for backward compatibility) is to leave it alone too. The idea of some similar function that explicitly says format details are subject to change across releases (even point releases?) is golden, though.

It’s not true, e.g., that all users of pprint() had to suffer with “too narrow” indentation. Those who wanted 4-space indents were free to ask for them via the indent= argument, added in Python 2.4. To my eyes, that’s the biggest “improvement” in the new scheme, but was already possible. Whether a diff can be cut by a line or two at “the start” or “the end” of a container display is a comparatively minor convenience.
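As a quick illustration of the existing indent= argument (a sketch only; the `names` list and the narrow width=20 are made up to force one item per line):

```python
import pprint

names = ["alpha", "beta", "gamma", "delta"]

# width=20 is chosen only to force wrapping in this small example.
narrow = pprint.pformat(names, indent=1, width=20)
wide = pprint.pformat(names, indent=4, width=20)

print(narrow)
print(wide)
```

The continuation lines start one column in with indent=1 and four columns in with indent=4.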

Rosuav · May 16, 2026, 12:31am

Sheesh, it was that long ago? I didn’t realise! Then yeah I guess it may be too late for that one, which would require a third function. Not so much a fan of that. Or maybe this can be considered a semi-breaking change, with the recommendation “if the format matters to you, switch from pp to pprint”.

barry · May 16, 2026, 5:44am

Only sorta kinda. I always put this^[1] in my conftest.py:

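from doctest import ELLIPSIS, REPORT_NDIFF, NORMALIZE_WHITESPACE

# Import paths as documented for recent Sybil releases; `namespace` below is
# presumably a project-local module whose import is not shown in the post.
from sybil import Sybil
from sybil.parsers.rest import DocTestParser, PythonCodeBlockParser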

DOCTEST_FLAGS = ELLIPSIS | NORMALIZE_WHITESPACE | REPORT_NDIFF

pytest_collect_file = Sybil(
    parsers=[
        DocTestParser(optionflags=DOCTEST_FLAGS),
        PythonCodeBlockParser(),
    ],
    pattern='*.rst',
    setup=namespace.setup,
).pytest()

So now they’re effectively global, but localized to my tests and no mucking about with thread-locals or whatnot.

[1] (I use Sybil and only put my doctests in .rst files, never docstrings)
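For comparison, a stdlib-only sketch of the same idea: passing the combined flags to the runner once instead of annotating each example. This is not the Sybil setup above, just an illustration with a hypothetical `run_with_flags` helper and a made-up snippet.

```python
import doctest

# Same flag combination, applied to every example the runner sees.
FLAGS = doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE | doctest.REPORT_NDIFF

SNIPPET = """
>>> print(object(), "a   b")
<object object at ...> a b
"""


def run_with_flags(text, optionflags=0):
    """Run a snippet with runner-wide option flags; return the failure count."""
    test = doctest.DocTestParser().get_doctest(
        text, globs={}, name="snippet", filename=None, lineno=0
    )
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    results = runner.run(test, out=lambda message: None)  # swallow the report
    return results.failed


print(run_with_flags(SNIPPET))         # default, strict matching
print(run_with_flags(SNIPPET, FLAGS))  # flags applied to all examples
```

The snippet carries no per-example directive, so it only passes when the flags are supplied to the runner.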

barry · May 16, 2026, 5:51am

+1

I really think that for 3.15 we have to revert the change and keep backward compatibility. It’s too late for 3.15 and we can take our time to design a better solution for 3.16.

tim.one · May 16, 2026, 5:58am

Barry Warsaw:

Only sorta kinda. I always put this in my conftest.py:

from doctest import ELLIPSIS, REPORT_NDIFF, NORMALIZE_WHITESPACE

DOCTEST_FLAGS = ELLIPSIS | NORMALIZE_WHITESPACE | REPORT_NDIFF

pytest_collect_file = Sybil(
...

Progress, I guess. I don’t believe I implemented any of that. Because I’m opinionated, and doctest has a specific POV. In particular, I would not have added an “ndiff” option at all: the focus is so heavily on “human readability”, and if your expected output is so large that you can’t see differences “by eyeball”, it’s probably a poor test on that count alone.

Despite that, I wrote ndiff too.

OTOH, now that I know such an option exists, I’ll also use it at times. I’m not married to my own opinions either.
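For readers who have not seen ndiff output, here is a small sketch using `difflib.ndiff` directly on two made-up one-line lists. It shows only the general style of report, not doctest's exact failure message.

```python
import difflib

expected = ["[1, 2, 3]\n"]
actual = ["[1, 2, 3,]\n"]

# ndiff pairs up similar lines and adds "?" guide lines that point at the
# characters that differ within them.
report = list(difflib.ndiff(expected, actual))
print("".join(report), end="")
```

The guide line is what makes a one-character difference in a long line findable.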

barry · May 16, 2026, 6:04am

It’s funny because as I looked in one of my repos to remind myself what flags I use, I had to go back and remember what REPORT_NDIFF does and then think about why I used it. I concluded that it’s basically just cargo-culted from my early doctest experiments in Mailman ages ago, where I was kind of using doctests in the wrong ways. Although maybe not for Mailman, where I often had to compare huge flattened email representations where a failure ended up being in a boundary line in a nested nested nested attachment. Or something.

tim.one · May 16, 2026, 6:42am

There’s just no way to know in advance, alas. Backward compatibility isn’t something language enthusiasts care much about. The audience is more people who just use the language in their work, just another tool (of many) to them. We rarely hear from them until we annoy them with “busy work” they didn’t ask for and don’t care about. Even then they’re unlikely to complain (what’s the point? they’re used to being ignored).

Nobody wants to drop/delay other stuff they’re working on to deal with changes forced on them by others. “Fine, don’t move to a new release then!” isn’t a good response.

Python is far from the worst in this respect, but also far from the best. For example, Java is (in?)famous for never removing/changing anything about existing behaviors. Java code you wrote 15 years ago will almost certainly work fine today.

Python has “deprecation cycles” that core devs understand, but they seem to get lost on the way to end users. Even point releases have a reputation for breaking code that worked fine - nobody in a large org treats upgrading to a Python point release as a no-brainer. For a major release, forget it. Mountains of code rely on “dead batteries” that were removed, and changes to the C API mean major extensions may need months to catch up. This includes undocumented changes to conceptually private API functions - the “need for speed” often tempts extension authors into using the fastest paths they can find, “advertised” or not.

End users don’t know much about any of that. All they know for sure is “my code worked yesterday, but the massive Python-adjacent ecosystem I rely on doesn’t work anymore”. They naturally blame Python - because it’s the only part that actually changed.

Fair? Of course not. Real life? Yup.