PEP proposal: Automatically Formatting the CPython Code

gpshead · November 3, 2020, 5:41am

I just wrote a tool that could help allay anyone’s fears that an auto-formatter might break code. It compares the generated bytecode before and after, ignoring non-logic changes such as the line numbers.

It could use code review eyeballs. It needs unittests. I doubt I’d ever merge the PR into cpython; it is probably better off as its own PyPI project.

This PEP and some related auto-formatter testing needs at work led me to start thinking about validating results which led me to create this.

rhettinger · November 3, 2020, 7:43am

I look at a lot of existing code and new PRs. AFAICT there is almost zero need for automatic code reformatting. To the extent there are readability problems, it is almost always due to poor code organization, bad variable names, unnecessary nesting, and misfactoring.

ISTM this PEP is a fun-to-write solution looking for a real problem to solve.

IIRC, Guido has said at one time that all PEP 8 auto-formatters are not PEP-8 compliant because PEP 8 was intentionally designed around human judgment calls. For example, it is hard for an autoformatter to know that sqrt(b**2 - 4*a*c) or a*x**2 + b*x + c have correct spacing. Tools like Black typically undo such efforts and makes the code worse rather than better.

pf_moore · November 3, 2020, 7:46am

This is one of my biggest problems with auto-formatters. Subprocess calls with command line options are a classic example here:

subprocess.run([
    sys.executable, "-m", "pip",  # Logically, "run pip"
    "install",  # The subcommand
    "-upgrade",  # Independent flag
    "-c", "constraints.txt",  # An option with an argument
    "foo", "bar", "baz",  # The packages being installed
])

An auto-formatter would mangle this example - it has no way of knowing the logical structure of the pip options, so cannot judge what should go together.

Adding #fmt: off and #fmt: on are just distracting clutter for an example like this.

isidentical · November 3, 2020, 9:04am

Cool idea, one question though. What is the advantage of comparing bytecode over comparing the ASTs? Isn’t it would be much more simple, and since the bytecode is deterministic, the same AST would point to the same bytecode (not the other way around though).

steve.dower · November 3, 2020, 9:15am

This is what I’ve observed as well. I don’t remember the last time I held up a PR because of formatting alone - there’s always been something more important that requires manual review to identify.

It’s not hard to politely ask for minor formatting changes anyway, and if someone does find it hard, it’s a skill that can be learned (or taught by a mentor).

I’d much rather see us build up each other to be better “senior developers” than point a bot at our contributing “junior developers”.

taleinat · November 3, 2020, 10:33am

This really isn’t. I think all of the authors would better enjoy coding or writing a PEP for some cool new feature. This came up naturally during the sprint when discussing how to make things better for new contributors. Before writing a draft PEP, we initially got positive feedback from several other core devs who all mentioned how great it is working on codebases which have adopted such practices.

Indeed, there’s a link to that quote in the draft. We’ve been careful not to frame this as “automatic PEP 8 formatting”, and not for semantic reasons. This is not about PEP 8; this is about not having to waste time and effort on code styling. You yourself gave a whole talk about how people tend to overly focus on code style and how that can be harmful, which I watched when this was just getting started, and sent everyone a link to.

taleinat · November 3, 2020, 10:47am

It can be frustrating for contributors; we’ve provided links to several examples in the draft. That’s not a good enough reason adopt auto-formatting by itself, but it does show that asking for minor formatting changes does have a cost.

It also requires core devs, when reviewing PRs, to check things like line lengths, which I think we can agree is not a great use of their (our) time.

FWIW we also discussed mentoring during the sprint, and I’ll be posting some results of that soon.

pf_moore · November 3, 2020, 10:57am

Honestly, I’d interpret the idea that we shouldn’t spend too much time focusing on code style as meaning that we simply don’t bother checking line lengths. Unless the code is unreadable, or otherwise needs fixing because it has maintainability issues or is confusing, then just let it go - it’s not that important that we should waste core dev time or frustrate new contributors by mechanically demanding conformance to a set of essentially arbitrary rules.

Even black (to take an example of a particularly opinionated formatter) allows a level of flexibility around line lengths.

Very strong +1 on this. “Consistent formatting” is a valuable characteristic in large projects, that we should be encouraging people to learn and apply. “Thoughtless adherence to a set of mechanical rules” isn’t.

steven.daprano · November 3, 2020, 11:27am

Consider the motivation. Tal wrote:

“”"

At the time of this PEP’s writing, there are over 1,300 open pull

requests (“PRs”) on the CPython Github repository [2], and over 3,000

open bug tracker issues with a patch or a PR [5]. This large “backlog”

has been acknowledged as a serious problem, as can be seen, for example,

by the Python Steering Council discussing this in November 2019 [13].

“”"

Is that a large number for a project the size of CPython?

How does that compare to other projects of similar size?

As a very, very rough measure of issues per project complexity:

the black project has 326 open issues per megabyte;
numpy has 260 open issues per megabyte;
Pandas has 362 open issues per megabyte;
and CPython has 165 open issues plus PRs per megabyte.

(I took the number of open issues on the project’s Github repository,

and divided by the size of the downloaded zip file as a rough measure of

complexity. For CPython, I used the number of open issues plus the

number of PRs. There is probably some significant double-counting by

doing so.)

Julia has 3119 open issues; Ruby has 1711; Perl has 1935; D has 4511.

I am not seeing evidence that CPython is doing worse at managing open

issues than other projects.

Tal continued:

“”"

The most significant reason for this is a lack of core developer time to

review and handle these patches and PRs, as noted in the Python

Developer’s Guide [6] and many times on the python-dev mailing list [7]

and discuss.python.org [8].

“”"

Is code formatting the bottleneck here? You talk about lack of developer

time, but are they spending unreasonable amounts of time reviewing the

code formatting of PRs?

If not, then this motivation is spurious.

If code formatting is not a bottleneck for reviewing PRs, then how will

automatic formatting help lower the number of PRs that need to be

reviewed, or speed up the rate that PRs are dealt with?

If this gets accepted, what metric do you propose we use to decide

whether the move to automatic code formatting was helpful or not?

Tal again:

“”"

Besides making the PR process more efficient, adopting auto-formatting

would “lower the bar” for new contributors. The current workflow

requires contributors to read PEP 7, PEP 8 and the documentation style

guide [14], and to understand the subtleties of when to apply them or

conform to the style of existing code.

“”"

Nothing prevents us from suggesting that new contributors might choose

to use a code formatter like black or autoPEP8. We could add that to the

Dev-Guide right now, possibly without even a PEP, link to the two or

three most popular code formatters. None of that requires it to be

mandatory.

“”"

Finally, constantly thinking about code style is a distraction from

more significant aspects of code, such as correctness and readability.

“”"

That’s an odd thing to say. I almost never think about code style, but

when I do, it is always because I am concerned about readability.

Mindlessly formatting code is certainly ripe for automation, but it’s

also something that many developers do unconsciously. I don’t think

about formatting the 90 or 95% of easy cases, it just happens without

much or any conscious thought (especially if my editor has good

auto-indent).

But it’s the remaining 5 or 10% of cases where I don’t want to have to

fight the auto-formatter precisely because I care about the readability.

pitrou · November 3, 2020, 1:24pm

Like @stoneleaf I really dislike black's output. I’m also unconvinced that code format requirements are actually a hurdle when contributing to a codebase such as CPython.

benjamin · November 3, 2020, 5:43pm

Focusing on the value for new contributors undersells automated code formatting. Everyone benefits from removing low-value formatting work from their coding and reviewing. Discussing the actual behavior of code is strictly more substantive than discussing its formatting.

Certainly, everyone can find something to dislike about any automated formatter. But, overall automated formatting is freeing. A major purpose of software engineering is to make machines do repeitive, dull tasks rather than humans. Code formatting is such a task. Automated formatting is a tradeoff that values achieving optimal whitespace beauty, which is subjective anyway, below spending brain time on more interesting topics.

gpshead · November 4, 2020, 2:13am

Checking… It looks like the constant folding I was imagining wouldn’t have happened at the AST level actually has. So comparing an AST also seems like a valid approach.

pganssle · November 4, 2020, 6:27pm

I was mildly skeptical of auto-formatting before, but now that I’ve started to use it regularly, it’s really hard to continue contributing to projects that don’t have it. It’s very freeing to just write code where I’ve messed up the indents or accidentally added trailing whitespace or used inconsistent quotation marks or something and then let the auto-formatter handle it.

I agree with Benjamin that the benefit for new contributors is underselling the value here. I think that at least some flavor of this (incremental, customized, etc) is worth doing even though I don’t really think that code formatting issues are a significant problem for our relationship with contributors.

As a convert to auto-formatting, I will say that one problem I have is that I think part of the clash here is that it’s very hard to reliably signal to auto-formatters that you want something to be formatted a little differently. It’s very all-or-nothing. 99% of the time, I’m fine with what black or yapf or whatever produces, but in the cases where it produces something hard to read, I have to disable all auto-formatting in that section to customize it. Obviously if the choice is between “no auto-formatting at all” and “auto-formatting except where you disable it”, then it’s pretty close to strictly better to have the auto-formatter, but it would be nice if I could tell an auto-formatter, “I’m going to align these list elements myself, but please do all the other stuff you were going to do.”

Maybe having an easier out than disabling all formatting in a section would help? (Bonus points if it’s unobtrusive, though the unobtrusive solutions tend to be the ones that surprise people the most).

steve.dower · November 4, 2020, 10:47pm

I use Black on a number of my other projects, and everything else that you say is true. On this point, refactoring into more statements and more temporary variables seems to be the best way.

In my opinion, spreading things out more is good for readability anyway, compared to squeezing even more into a single line.

So if we’re happy enough with a big format changeset (on all branches) followed by a few refactorings to fix egregious sections, I’m okay with it too.

But we should vendor a copy of the formatter in Tools. Changing version over time is a real issue, so let’s pin in the safest possible way.

eric.snow · November 6, 2020, 3:34am

FWIW, I agree with the folks that are not in favor of autoformatting.

In fact, mostly all I see are the downsides. Dealing with auto-formatting will take up some of the little time I have for open source. (That may sound weird but my experience with Go, Typescript, and Black have taught me that I’ll have to regularly spend extra time to tweak code and tooling to get autoformatting to do the right thing.)

-eric

gpshead · November 6, 2020, 5:05pm

Those sound like workflow problems. Those of us used to working in autoformatted codebases in favor of persuing this wouldn’t be proposing it if that was the norm and expected outcome.

For acceptance of such a proposal I’m assuming dealing with such workflow issues so they don’t add work for people would be part of the pep.

Ex: today I can create PRs that then go onto CI actions and fail various whitespace or formatting checks with high latency after my attention has shifted away. Doh! That is an antipattern. Devs shouldn’t have to wait for a CI system to block them to find out they’ve got a non-logic issue like that to clear up. So we’ve already got a bit of this hell. A goal of implementing and adopting autoformatting would be to reduce that kind of thing. Definitely not add more.

pf_moore · November 6, 2020, 6:10pm

So we add tools to mitigate the stumbling blocks that automated style checkers cause? Hmm.

brettcannon · November 6, 2020, 6:55pm

But it also goes the other way:

“So we don’t use tools to mitigate code review overhead because people won’t set up one more tool once when they are already using tools to write the code to be reviewed? Hmm.”

Or put another way: how are formatters that gate acceptance of a PR different than any other test we have in our test suite? Both are making sure our coding standards are upheld. You have to know how to run the test suite. If we ship the formatter with Python then there isn’t even an install step.

Take this out to documentation and we actually do have a requirement you install Sphinx as an external tool to make sure the docs you just wrote pass. Now we could choose to accept docs that are busted and Python would keep functioning, same with any warning coming from Sphinx. But we have decided we want warning-free, syntactically valid reST. How is saying we consider “properly” formatted source code any different than not wanting any warnings? Neither technically hurt things necessarily if they go in, but we are choose to care about it.

Remember, we already implicitly have blurb and sphinx as external tools to install and use. And you could argue that make patchcheck is already doing a very lightweight version of this, so I think we are past the burden of asking users to run tools.

Or we could always go back to manual downloads of patch files and running the test suite manually.

pf_moore · November 6, 2020, 7:59pm

Absolutely. Sorry for being a bit too terse and light-hearted. My point was really that this is all a balancing act, with solutions causing their own problems which again need solutions, etc.

IMO, the only thing that matters in this area is that the code is readable. And while that’s subjective, I strongly believe that badly formatted code is a case of “I’ll know it when I see it”. If you’re not sure, then it’s not badly formatted, it’s just using a style that may not be what you prefer¹. There’s a reason PEP 8 is framed as a set of guidelines, not as hard rules.

Personally, I’d much rather keep the current checks, and manually format the code. I don’t often violate PEP 8 when writing code by hand, and I’m fine if I do with either catching in in a linting pass (if I remember) or having CI catch it and manually fixing it². I’m not keen on auto-formatters because they force a specific layout choice out of many which satisfy PEP 8, and don’t give me any flexibility to express my intent. They essentially make it mandatory for me to run the formatter over the code - as otherwise, CI will do it for me, and the code that gets committed won’t be laid out as I wrote it. So there’s a new step for me of reviewing my own PR, to ensure that what I wanted to say is still expressed clearly now that the auto-formatter has modified it.

But I’ve already carried this on too long. The one constant in any style discussion is that debating what formatting rules and auto-formatters to apply always consumes way more resource than the final agreed solution will save. So I should do my bit and minimise the overall cost here

¹ Project style rules (even PEP 8!) may well mandate choices that aren’t what you prefer - and you still have to follow them. That doesn’t in itself mean that either your preference or the project standard is “badly formatted”.
² Although I reserve the right to use a sarcastic commit message when appeasing the linter

aeros · November 7, 2020, 9:54am

This seems like it could be a decent in-the-middle solution: have an optional PEP8 code style check CI added via GitHub Actions for CPython PRs. It could help with resolving general gray areas and guiding contributors, but leaves the core developer(s) that review and merge the PR with the discretion to ignore it in situations where readability is better without following the automated style rules.

Something like a make codeformat can be added for those who want to spend as minimal mental energy as possible on code style (I suggest for it to be separate from make patchcheck in case the auto-formatting is undesired).