One Sentence Per Line for PEPs (and more?)

CAM-Gerlach · February 24, 2022, 7:55pm

While this topic is pretty important to me, due to the huge amount of time, stress, cognitive load and sub-optimal content-relevant choices it’s saved me over the years in docs/website/etc repos that switched to OSPL, and the amount of the same it costs me every day as a PEP writer and editor with those that don’t, I was intending to wait to consider and potentially propose the detailed case for it (at least in the context of the PEPs repo) once I’d established some credibility as a PEP editor and in the community. Unfortunately, it seems my OT aside that sparked off @encukou’s SemBr thread let the cat out of the bag, and on a day when we were dealing with a severe weather situation too.

Abstract

Presently, the arbitrary 79-character line length limit for prose text in PEPs imposes an outsize burden of mechanical effort and cognitive overhead on PEP authors, editors, and community members when writing, editing and reviewing PEPs, due to the need to constantly reflow text, its lack of compatibility with common tools and editing/reviewing workflows, and greatly increased line noise in Git diffs, blame and other output.

We propose recommending the One Sentence Per Line (OSPL) approach for prose text in new PEPs, which features line breaks after each sentence, with a 1:1 mapping of sentences to lines and a soft length limit (based on reasonable sentence length) instead of a hard one.

By matching the physical structure of the file to its logical hierarchy, this is a straightforward, simple-to-implement solution to all the problems above, without the much greater complexity for both humans and machines to understand, implement and enforce introduced by “semantic breaks” or other more complicated approaches.

Motivation

At present, a maximum line length of 79 characters, and a minimum length of 70 characters, is nominally required for PEPs. However, many PEPs don’t keep to this nominal length limit for all lines, and even fewer are consistent about following the minimum. This lack of consistency can lead to confusion about conventions and expectations by PEP authors and dilutes the value of a nominal standard. Likewise, there is wide inconsistency in separating sentences; some by one space, some by two, and others by a line break.

However, such relatively modest issues are but the tip of the iceberg of the problems hard breaks cause for prose text. These include:

A substantial burden of time, tedious effort and stress on PEP authors, editors and reviewers to constantly reflow the physical text to fit a 79 column limit whenever changes are made, or else try to coerce their prose to conform to these arbitrary restrictions.

Given that soft wrapping is readily available in all modern editors/IDEs, as well as in most common contexts on GitHub for prose documents, whereas tools to automatically reflow reST (or in many cases, even indicate the hard-wrap width) are not, this does not seem to still be justifiable.
As far as the authors are aware, only the RST extension for Emacs offers full support for automatically reflowing reST prose. Per the latest Python Developers Survey 2020, the most recent as of this writing, only 2% of Python developers currently use Emacs. Therefore, it is likely that most PEP contributors do not regularly use an editor with an extension that reliably supports this.

Furthermore, regardless of a contributor’s choice of primary editor/IDE, a substantial amount of interaction with PEPs typically takes place outside of it, such as on GitHub, where soft-wrapping is usually available (at least on desktop), but even minimal assistance for hard wrapping (e.g. a column count, or a vertical rule at 79/80 characters) is not.
It increases the difficulty in parsing and manipulating the text, as the physical units as seen by the editor do not align with the logical units to be manipulated (phrases, sentences or paragraphs).
It also makes it easier to make many classes of typographical and editing mistakes at hard-wrap boundaries, and substantially more difficult to catch them.

In particular, repeated, missing, extraneous and non-matching words/phrases can be present on both sides of an arbitrary hard break, as can happen during editing or reflowing, but require a careful eye to catch.
For example, on python/peps#2164, multiple passes of manual and regex searching discovered and fixed many such mistakes, but some were only caught by a careful read-through by reviewers, and others likely remain undetected. Additionally, the reflowing process itself results in much more frequent mechanical edits, increasing the probability of mistakes and requiring more frequent and extensive proofreading passes to be assured of the text’s quality.
Git/GitHub diff, blame, patch, review and suggestion output is much more noisy and less meaningful.

The basic unit of text considered by Git, the line, has no syntactic or semantic meaning in such files, and even a relatively small change to a single line can propagate to an entire paragraph. This results in git diff and git blame output, and their GitHub equivalents (particularly on pull requests), alongside patches and GitHub suggestions, being long, noisy and difficult to read and parse for the actual meaningful changes. It also interferes with GitHub's word-level change highlighting, making it much less useful.
Arbitrarily split lines make review comments and suggestions significantly more difficult and less useful, with a higher chance of merge conflicts.

Multi-line comments/suggestions, which are more tedious and don't always work as desired, must be used far more often even for small changes in a single sentence, as not only will they often span multiple lines (often with deleted lines in the way, that block suggestions completely), but also require reflow of many succeeding ones, or even an entire paragraph, which is often not possible at all, and otherwise must be done manually with no aids. Finally, given this results in more lines being changed, it also increases the probability of merge conflicts even between logically independent changes.
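
To illustrate the diff-noise point, here is a minimal Python sketch that counts how many lines a line-based diff would report for the same one-sentence edit under both layouts. The sample sentences and the `changed_lines` helper are made up for this example, and `textwrap.wrap` merely stands in for a manual 79-column reflow; this is not how Git itself computes diffs.

```python
import difflib
import textwrap


def changed_lines(old_lines, new_lines):
    """Count lines that a line-based diff would report as removed or added."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return sum(
        (i2 - i1) + (j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


# Hypothetical three-sentence paragraph, stored one sentence per line.
old_sentences = [
    "Hard wrapping ties each line break to a column limit.",
    "A small edit near the start of a paragraph can therefore move every later break.",
    "With one sentence per line, the same edit touches only the sentence that changed.",
]
# The edit: a few words are added to the first sentence only.
new_sentences = [
    "Hard wrapping ties each line break to an arbitrary 79-character column limit.",
] + old_sentences[1:]

# OSPL layout: each sentence is already one physical line.
ospl_changed = changed_lines(old_sentences, new_sentences)

# Hard-wrapped layout: join the sentences and reflow to 79 columns.
hard_changed = changed_lines(
    textwrap.wrap(" ".join(old_sentences), width=79),
    textwrap.wrap(" ".join(new_sentences), width=79),
)

print(f"OSPL: {ospl_changed} changed lines; hard-wrapped: {hard_changed}")
```

In this toy case the OSPL edit touches only the old and new versions of one sentence, while the hard-wrapped paragraph has to reflow past the edited line.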

Rationale

The One Sentence Per Line (OSPL) structure for prose text in new PEPs, coupled with replacing the nominal hard line length minimum and maximum with soft limits based on reasonable sentence length, is a straightforward approach to neatly resolve the above issues. Conceptually, it means a 1:1 mapping between physical lines and logical sentences, and practically, is simple to implement, as contributors would just use a line break instead of (one or two) spaces after each sentence. This makes writing, editing and reviewing text much easier and less painful by matching the physical to the logical structure of the prose.

In particular, for authors and editors, it:

Avoids the constant need for tedious, disruptive reflows upon most changes, which is a large drain on author, editor and reviewer time.
Makes it harder to commit and easier to notice many common writing and editing mistakes spanning multiple lines (e.g. repeated, missing, spurious or non-matching words).
Results in it being much easier to spot sentences that are too long, too short or vary too much in length, as well as repetitive/redundant patterns or breaks in valid ones.
Allows easily swapping, moving, commenting and deleting sentences with the commands and shortcuts in nearly any common source editor, and makes it simple to split and join paragraphs.
Neatly sidesteps debate over one versus two spaces after sentences, while having the benefits of both.

It also benefits reviewers and Git workflow; specifically, it:

Enables shorter, cleaner and more logical and semantically meaningful diffs, blame and patch output, and allows GitHub’s per-word diff display to work properly
Allows reviewers to comment/make a suggestion on a sentence in one click, instead of using more involved and often flaky multi-line selections
Prevents having to constantly guess and check or roundtrip from an editor when making GitHub suggestions to fiddle with line length
Reduces the chance of merge conflicts by localizing changes to the specific sentence(s) modified

This could affect readability in certain cases for a minority of contributors, if using limited viewing/editing interfaces that still lack support for soft wrapping (such as parts of GitHub’s mobile interface). However, it is a substantially lesser impact than the status quo, which ultimately harms usability for nearly all editors, interfaces and tools (aside from those specifically supporting parsing and reflowing reST syntax, which is a very small fraction).

We also carefully considered alternatives besides OSPL and the status quo, including paragraph- and semantic breaks, but in our analysis and experience, neither offers the same balance of both conceptual advantages and practical usability as OSPL.

Specification

In English prose portions of source files formatted with a “One Sentence Per Line” (OSPL) structure, a single line break MUST be used after each prose sentence, and SHOULD NOT be used within a sentence, absent other markup which requires it.

For such files, a hard minimum or maximum character limit for column width MUST NOT be enforced for prose text, but to aid readability, sentences SHOULD be kept to a reasonable length (with ≈240-300 characters max as a rough guide).

The OSPL source structure is RECOMMENDED for new PEPs, and MAY be used for existing draft PEPs whose authors choose to adopt it. PEPs MUST be consistent about their choice of source structure.

OSPL MAY be used for prose text by other Python-related projects, such as documentation, should they choose to adopt it.
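
Because the length guide is advisory, tooling for it would report rather than reject. A minimal sketch in Python, assuming a hypothetical `long_sentences` helper and taking 300 characters (the upper end of the rough guide above) as the default threshold:

```python
SOFT_LIMIT = 300  # assumption: upper end of the rough 240-300 character guide


def long_sentences(text, soft_limit=SOFT_LIMIT):
    """Return (line_number, length) pairs for lines longer than the soft limit.

    This only reports candidates for rewording; it never modifies the text
    and is not meant to fail a build.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > soft_limit
    ]


sample = "A short sentence.\n" + "word " * 70 + "end.\nAnother short one."
for number, length in long_sentences(sample):
    print(f"line {number}: {length} characters, consider splitting the sentence")
```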

Backwards Compatibility

As this change does not affect the meaningful syntax of reST (or Markdown) files, nor the rendered output, there is no direct backwards compatibility impact. We could programmatically convert files from hard breaks to OSPL, but particularly on the PEPs repo, this is not really necessary, aside from authors who may wish to do so when rewriting their draft PEPs. Rather, future PEPs can simply be written using OSPL.

Security Implications

As this merely changes the line-breaking strategy, there are no conceivable negative security implications from this change.

How to Teach This

Teaching PEP authors, editors and reviewers to implement this is the simplest of all considered alternatives (including the status quo), aside from possibly paragraph-only breaks. Contributors need only to press Enter rather than (once or twice) Space after each sentence. That’s it, and it can be relatively reliably automated with a fairly straightforward regex-based script.
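
As a rough sketch of what such a regex-based script might look like (an illustration, not the authors' actual tool), the following Python joins a hard-wrapped paragraph and re-breaks it after sentence-ending punctuation. The `to_ospl` name, the pattern and the small abbreviation guard are assumptions; it deliberately ignores reST constructs such as lists, directives and literal blocks, and will mis-split on abbreviations it does not know about.

```python
import re

# Assumption: a sentence ends with ".", "!" or "?", followed by whitespace
# and an uppercase letter. The two negative lookbehinds keep "e.g." and
# "i.e." from being treated as sentence ends.
_SENTENCE_BREAK = re.compile(r"(?<!e\.g\.)(?<!i\.e\.)(?<=[.!?])\s+(?=[A-Z])")


def to_ospl(paragraph):
    """Convert one plain-prose paragraph to one sentence per line."""
    # Collapse hard wraps and repeated spaces into single spaces first.
    joined = " ".join(paragraph.split())
    # Then replace the whitespace after each sentence with a line break.
    return _SENTENCE_BREAK.sub("\n", joined)


hard_wrapped = (
    "This paragraph was wrapped at a\n"
    "fixed column.  It has two spaces after some sentences, e.g. This\n"
    "one. The last sentence is short!"
)
print(to_ospl(hard_wrapped))
```

A real formatter would need to parse reST structure first and only apply something like this to prose paragraphs.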

Unlike hard breaks, there is no need for contributors to install and set up an editor and appropriate plugin(s) to hard-wrap reST (if one is even available for the users’ platform); nor configure an editor to display a vertical rule at the indicated column length, learn what constructs in reST can be broken by spaces, and then remember to apply them; nor copy and paste back and forth from such an editor when working in other interfaces (e.g. GitHub).

Likewise, unlike semantic breaks, there is no need to teach contributors (particularly those for whom English is not their native language) to constantly recall and parse the technical details of English grammar as to the difference between dependent and independent clauses, and remember and worry about circumstances that may or may not merit a line break, and the various other nitpicky rules of sembr.

While paragraph breaks are perhaps even more “natural”, by using a space rather than a line break after sentences, they do create the potential for uncertainty and inconsistency with the issue of one versus two spaces after sentences, which OSPL neatly side-steps.

Reference Implementation

OSPL requires no particular build system, linting or other changes to implement on the PEPs repository (or presumably elsewhere). However, the authors propose developing, as part of this proposal, an autoformatter to automatically conform (new) PEPs to this style, on an opt-in basis, which the authors have already experimented with locally. Such a formatter would leverage the existing pre-commit framework already used on the PEPs repository for local automatic, on-demand and CI based linters and fixers. With such, the formatter would be automatically installed, executed and updated via pre-commit cross-platform with a single one-time setup command (along with the existing linters), and could be run both on-demand and automatically with each commit, with no additional user effort, as well as optionally on the CIs (for opted-in PEPs).

Such a formatter could not only apply the OSPL style, but could also be extended to do so for the existing PEP 12 requirements, such as header underline characters, preceding/following line breaks and other mechanical changes, greatly smoothing the process for both authors and editors/reviewers alike, much like the black autoformatter for Python code. In the future, it could even be configured and adapted for other Python prose applications, if desired.

However, the Spyder-Docs repository for the Spyder scientific environment and IDE, which the primary author is the maintainer of, serves as an example of a moderately large documentation project (and associated style guide) that has successfully adopted OSPL, along with the various other prose projects in the Spyder organization (including the Spyder website, website theme, docs theme, API docs, proposals and others).

Additionally, AsciiDoc, another major documentation system, has adopted OSPL as a recommended standard.

If its adoption and use by the PEPs repository is successful, and critical issues are resolved or worked around, the PEPs repo in turn may serve as a reference implementation for other prose content in the Python community.

Open Issues

Would it make sense to extend this approach to the Python documentation and related projects (devguide, etc) as well? Or would it be wiser to just apply it to the PEPs repo initially, as currently proposed here, which limits the scope and allows it to be tested in a controlled, well-suited and minimally disruptive environment for possible applicability elsewhere?

Rejected Ideas

Hard wrapping

Hard wrapping is the status quo, the myriad of issues with which are described in the Motivation section, and summarized here.

+ The status quo; requires no nominal changes
+ Keeps physical line width narrow, improving usability on some platforms without soft breaks (e.g. GitHub mobile)
- Requires constant, usually manual reflows
- Less straightforward to make common edits with standard tools
- Makes diffs, blame, patches and suggestions very noisy and less meaningful
- More difficult to use with Git/GitHub review workflow
- Easier to make many mistakes and harder to spot
- Inconsistent separation between sentences (1/2 spaces, or sometimes break)

Paragraph breaks

With paragraph breaks, line breaks are only used to separate paragraphs and syntax. This is the most “natural” style in prose writing, but despite being a nearly opposite approach to hard breaks, it shares many of the same downsides, and thus hasn’t been seriously considered.

+ Most natural and easiest to learn (what untrained users type by default)
- Far longer line lengths than any other strategy, for cases where it matters
- Noisy, not very useful diff, blame and patch output
- Poor granularity with comments/suggestions applied on whole paragraphs
- Higher chance of suggestion and merge conflicts
- No clear separation between sentences

Semantic breaks

With semantic breaks (SemBr), line breaks are placed between semantically meaningful sub-components of sentences and other constructs. Conceptually, it is rather attractive for many of the same reasons as is OSPL, and it still is generally better (at least in theory) in most respects than “dumb” hard breaks in that the breaks have semantic meaning and the diffs are cleaner, while keeping line length shorter than OSPL (usually around or below the previous 80-character hard limit, so long as the clauses are a reasonable length).

However, “practicality beats purity”, and the practical downsides in actual use (increased tedious effort and cognitive burden; a steeper learning curve, especially for those who aren’t native English writers; difficulty of consistent enforcement; impossibility of checking or automating with tooling; a choppy, poetry-like flow when reading; greater total vertical space consumption; and more complex review and suggestions) weigh strongly against it. From our experience, OSPL is a much better solution in practice, which seems to be the case for others as well, as the authors are currently not aware of any large projects that have formally adopted and consistently used semantic breaks.

In more depth, some of the practical difficulties with SemBr are as follows:

It results in a high degree of tedious effort inserting breaks everywhere, and even greater cognitive burden (having to constantly parse both the language syntax and semantics to determine what merits a break, rather than simply putting one after each sentence and being done with it)
It is less obvious to readers, and substantially more difficult to teach writers, especially the high proportion of those for whom English is not their first language, how to determine where the breaks go (they have to understand much more about the technical details of the English language, which is difficult even for many native English speakers)
It is practically much more difficult to mandate or apply consistently, since it is a less simple, obvious and “natural” style than sentence-level breaks
Since it relies on details of English grammar and semantics, it is essentially impossible to check or automate with tooling
As others mentioned, the very short fragments are too short to smoothly read
It increases total length and vertical space consumption by several times
It is more complex to review/suggest, because often one wants to (or later decides they need to) modify more than one clause in a sentence, which does not play well with GitHub suggestions

In summary:

+ Writing and editing are less constrained than hard breaks (though more than OSPL)
+ Line length is as short (or shorter) than hard breaks
+ Diffs, blame, etc. are fairly meaningful and highly granular
+ Conceptually rather elegant, at least in theory
- Still a lot of tedious effort, and even more cognitive burden
- Less obvious to contributors and requires substantial effort to teach
- No clear, consistent rules could result in churn and bikeshedding
- Impossible to check, lint, or reflow automatically
- Review comments/suggestions still difficult due to multiple lines
- Can be difficult/choppy to read and gauge flow
- Longer file length and vertical space consumption

CAM-Gerlach · February 24, 2022, 8:28pm

Just to clarify, my position on SemBr is that it is generally an improvement over “dumb” hard wrapping, especially for reST where it is a semi- or completely manual process anyway, and it could make more sense for projects like the CPython documentation, in terms of being practically easier to adopt incrementally and non-strictly-enforced on existing, gradually-updated content that currently uses hard breaks, despite the practical downsides I highlight versus OSPL.

However, the case for OSPL is much stronger particularly for repos like the PEPs, where:

The benefits of OSPL are more acute (given the high, concentrated amount of rewriting, editing and review during the PEPs’ pre-draft and draft stage)
The difficulties in understanding, teaching and consistently enforcing SemBr come to the fore (since many authors are first-timers or don’t write PEPs regularly, versus a contributor base experienced in technical English writing)
The adoption issues are mostly moot (existing non-Draft PEPs are rarely edited and stay as they are, new PEPs can adopt it and Draft PEPs can if their authors choose to).

And as mentioned, this opens up the opportunity to trial it there, and then expand to others if successful, incorporating any lessons learned.

guido · February 24, 2022, 8:41pm

Um, I didn’t read all that (sorry), but can we at least make it opt-in for PEP authors? Not everyone uses the same tools, and personally I’m set in my ways.

stoneleaf · February 24, 2022, 8:45pm

What are the average line lengths in that project? I think OSPL has some merit, but I also know that if I’m reading a paragraph with lines longer than about 120 characters, my eyes have a hard time tracking to the next line correctly.

CAM-Gerlach · February 24, 2022, 8:45pm

I kept the actual specification section very short, just for you. It says:

So it is a recommendation, not a requirement if authors strongly prefer not to use it. What is required is that PEPs are consistent about their structure (so contributors to your PEPs would be expected to respect the author’s choice).

hugovk · February 24, 2022, 9:17pm

Please could you link directly to a file on GitHub that uses OSPL?

Does it suffer the problems shown in these screenshots where the editor has word wrapping off, or editing on GitHub with mobile?

CAM-Gerlach · February 24, 2022, 9:17pm

Indeed; having some background in design/page layout as well as technical writing, I’m with you on the value of keeping the displayed output at a reasonable column width to aid (source) readability. However, at least in my experience, this can be accomplished for prose writing in the great majority of contexts without tediously hard-wrapping every line in the source file, by taking advantage of soft-wrapping in viewers, editors and (of course) rendered output.

Typical sentence length in the Spyder docs varies, but is roughly 160 characters, with most sentences ranging from 80 to 300 characters (which, incidentally, is possible to even visually estimate due to OSPL). However, as mentioned, this isn’t really relevant for displayed width in most contexts; I typically soft-wrap my editors/IDEs around 80 to 120 characters, and GitHub, even when fullscreen, soft-wraps at around 140, assuming I’m reading the source and not the rendered output (which is always soft-wrapped, of course).

steven.daprano · February 24, 2022, 9:18pm

A couple of observations:

You don’t appear to have used “One Sentence Per Line” in this pre-PEP document. That hurts your credibility.
I think this is a clear case of Hammer Syndrome.

Git and diff are excellent hammers for working with line-based code. They are not screwdrivers for working with paragraph-based prose. Instead of forcing paragraph-based prose into a line-based format so that git/diff can hammer them better, perhaps we need better tools.

One of your paragraphs is:

“”"

As this change does not affect the meaningful syntax of reST (or Markdown) files, nor the rendered output, there is no direct backwards compatibility impact. We could programmatically convert files from hard breaks to OSPL, but particularly on the PEPs repo, this is not really necessary, aside from authors who may wish to do so when rewriting their draft PEPs. Rather, future PEPs can simply be written using OSPL.

“”"

Written in OSPL style, that becomes:

“”"

As this change does not affect the meaningful syntax of reST (or Markdown) files, nor the rendered output, there is no direct backwards compatibility impact.

We could programmatically convert files from hard breaks to OSPL, but particularly on the PEPs repo, this is not really necessary, aside from authors who may wish to do so when rewriting their draft PEPs.

Rather, future PEPs can simply be written using OSPL.

“”"

(Apologies in advance: Discourse has a distressing tendency to mangle text sent via email. It is possible that it may reflow the three lines above, or insert extra line breaks, or add blank lines, or even truncate my post.)

Those three lines have lengths of (approx) 150, 200 and 50 columns respectively. In my text editor, with word-wrapping turned off, the first two are essentially unreadable due to the need for horizontal scrolling.

With word-wrapping set to 80 columns, I get:

“”"

As this change does not affect the meaningful syntax of reST (or

Markdown) files, nor the rendered output, there is no direct backwards

compatibility impact.

We could programmatically convert files from hard breaks to OSPL, but

particularly on the PEPs repo, this is not really necessary, aside from

authors who may wish to do so when rewriting their draft PEPs.

Rather, future PEPs can simply be written using OSPL.

“”"

To me, that looks like missing blank lines between one-sentence paragraphs, not a three-sentence paragraph.

I acknowledge the difficulties you list with the status quo, although my experience is that the difficulties and annoyances are not quite as severe as your pre-PEP suggests. YMMV. But I wish for better tooling, not to change the way we write paragraphs to suit the tools.

Or perhaps I should say:

I acknowledge the difficulties you list with the status quo, although my experience is that the difficulties and annoyances are not quite as severe as your pre-PEP suggests.

YMMV.

But I wish for better tooling, not to change the way we write paragraphs to suit the tools.

(Again, I have no idea how Discourse will render those three lines.)

CAM-Gerlach · February 24, 2022, 9:44pm

Sure. From the Spyder-Docs repo I linked above, see, for example, spyder-ide/spyder-docs#296. In particular, see this sentence as an example of a longer one in which relatively few changes were made, but GitHub’s highlighting makes very clear what those are (similar to the example in your comment on the other thread). And for the opposite case, see this for an example of a sentence that was almost entirely rewritten, but GitHub still accurately shows what parts were and weren’t changed. With hard breaks, this won’t work well for the former, and is almost completely impossible for the latter. (Of course, if viewing the files themselves, rather than the diffs, they are rendered and wrapped by GitHub anyway regardless of the source wrapping).

You might find my reply on the other thread of interest, as it addresses a lot of the points in the post you linked (which I’d been meaning to reply to), as well as the above.

If you deliberately toggle word wrapping off in your editor for reST files, then yes, sentences would be pretty long, but I don’t really understand why you would realistically want to do that. For prose text, making your editor pane 80 characters wide, or any width your eyes prefer, will produce the same effect as hard-wrapping at that specific character threshold without all the work and other costs of actually baking it into the file.

As for GitHub Mobile, I can’t speak to that as I’ve never used it (except to check rendered output in mobile browsers), but it seems like a GitHub bug. If that is still a problem, it is unfortunate and hopefully will be fixed soon, but I’m not sure that specific edge case justifies the other continuing costs of hard breaks elsewhere; I would be curious to hear of others, though. And I’d certainly hope that others aren’t spending hours every day on mobile writing, editing and reviewing PEPs, as I do on desktop; I’d be worried about a lot of other barriers to usability in that case, though I would certainly appreciate hearing how common this is in practice.

pf_moore · February 24, 2022, 11:18pm

I strongly agree with this. This would not suit my workflow at all.

I use my editor with soft line wrapping switched off, so long lines require me to scroll horizontally.
I have a hard wrap configured globally, so long lines are hard for me to enter (I have to continually undo the auto-wrapping).

I’d be OK (not happy, but I could accept it) with a guideline that said “line breaks must occur at the end of a sentence, and may occur within sentences where they should be at meaningful break-points in the prose” (I’m not 100% sure if that’s precisely “semantic breaks”, which is why I avoided that term). I would like to retain a maximum line length, though, specifically so that people are not required to configure their editor to soft-wrap.

IMO, the source of a PEP should be written for ease of authoring, rather than for ease of copy-editing. I understand this makes things harder for the PEP editors, and that’s a pity, but IMO it’s already daunting to be asked to write a PEP for your proposal, and having a bunch of style rules that contradict “natural” prose style just makes things worse.

As a PEP author, and the packaging PEP-delegate, OSPL would make my job significantly harder. Semantic breaks (with a line length limit) would be a net disadvantage to me, but I could live with it if it improved things for others.

As a compromise, maybe PEP authors (who are responsible for changes to the PEP while it’s being developed) could be allowed to write their PEPs in whatever style they prefer, and once the PEP is marked as final, the PEP editors can do a reformatting pass over it, to migrate it to a semantic break style that suits subsequent edits (where the PEP editors presumably assume a greater responsibility over the text?)

I’ll stop at this point, because the more I think about this proposal in the context of writing a PEP, the more I dislike it, so I’m running out of constructive things to say

CAM-Gerlach · February 24, 2022, 11:19pm

Just to note, as I mentioned to @guido , as currently proposed above, OSPL is only a recommendation—if you do strongly prefer, you would still be free to write your PEPs in the old hard-break style, or with semantic breaks (as a subset of such), and we would respect that when editing it (per the MUST).

As you predicted, it seems your post is indeed not rendered as you presumably intended; I will attempt to repair your examples as I quote them (unfortunately, I’m only TL3, not TL4, so I can’t fix them in your original post). I do want to mention, as I believe several other community members here have also said, that other heavy Discourse email users (like Cameron Simpson) have their messages consistently display just fine, so you might want to double-check your source formatting and your email client configuration as well.

I wanted to, but unfortunately Discourse posts/comments do not follow the standard reST and Markdown conventions of ignoring single line breaks for the purposes of line wrapping. Therefore, it would change the rendered output rather than just the raw source, which is integral to the proposal itself.

I think we all wish we had better tooling, and can agree that in an ideal world, it would be the optimal solution to this problem, regardless of the approach we go with in the file itself. If all viewers, editors, IDEs and web UIs had reST-compatible automatic reflowing, and Git/our VCS of choice used a sophisticated algorithm and reST-specific logic to record and display changes in a way specifically tailored to prose content, and these tools were easily and universally usable for users across platforms, most of the issues with hard breaks cited in this PEP would no longer be a real problem. Likewise, if all possible viewers, editors and web platforms had fully intelligent, user-customizable soft wrapping (and per-word diffing), the motivation for hard-wrapping would go away.

Unfortunately, we don’t live in an ideal world, as much as it often pains me (with the unexpected circumstances that motivated this thread, not to mention ongoing world events, being all too fresh reminders), and those tools would have to be discussed, designed, prototyped, developed, tested and deployed, if that’s even practical to do. I don’t think it’s realistic to switch the PEPs repo to a different VCS and code hosting site, if one even exists that solves this problem, nor to build reST-reflowing plugins for common editors (or require all users to use emacs with RST-mode). As it stands, I don’t think we should allow the hypothetical possibility of better tools to block us from taking a straightforward, practical step to improve the situation with those we have available.

The one part that would be at least theoretically feasible to develop is a black-like reST auto-formatting system that could run locally on commit and in CI, and could include reflow. However, aside from the need to develop it, without OSPL it only mostly solves one of the problems mentioned (the need to manually reflow), and it adds significant extra complexity and barriers for contributors and maintainers.
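To give a rough sense of what the reflow step of such a formatter might involve, here is a minimal Python sketch. It is not an existing tool: the `to_ospl` function and its regular expression are hypothetical, and the sentence detection is deliberately naive (it would mis-split after abbreviations followed by a capitalized word, and it knows nothing about reST directives, lists or literal blocks).

```python
import re

# Naive assumption: a sentence ends at ".", "!" or "?" followed by
# whitespace and then an uppercase letter.
_SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def to_ospl(paragraph: str) -> str:
    """Reflow one hard-wrapped prose paragraph to one sentence per line."""
    # Collapse existing hard breaks (and double spaces) into single spaces.
    joined = " ".join(paragraph.split())
    # Put each detected sentence on its own line.
    return "\n".join(_SENTENCE_BREAK.split(joined))


# A hard-wrapped paragraph taken from the proposal text.
hard_wrapped = """\
As this change does not affect the meaningful syntax of reST (or
Markdown) files, nor the rendered output, there is no direct backwards
compatibility impact.
We could programmatically convert files from hard breaks to OSPL, but
particularly on the PEPs repo, this is not really necessary, aside from
authors who may wish to do so when rewriting their draft PEPs.
Rather, future PEPs can simply be written using OSPL.
"""

ospl = to_ospl(hard_wrapped)
print(ospl)
```

Even this toy version hints at the extra moving parts a real formatter would need, since everything that is not plain prose would have to be detected and left alone.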

In summary, we all wish we had a custom-designed, 3D printed set of tools to fit the exact job at hand, but I don’t think we should let wishing we had them stop us from replacing a hammer with a screwdriver that isn’t quite the right size, instead of just bashing away or giving up entirely.

steven.daprano:

One of your paragraphs is:

As this change does not affect the meaningful syntax of reST (or Markdown) files, nor the rendered output, there is no direct backwards compatibility impact. We could programmatically convert files from hard breaks to OSPL, but particularly on the PEPs repo, this is not really necessary, aside from authors who may wish to do so when rewriting their draft PEPs. Rather, future PEPs can simply be written using OSPL.

Written in OSPL style, that becomes:

As this change does not affect the meaningful syntax of reST (or Markdown) files, nor the rendered output, there is no direct backwards compatibility impact.
We could programmatically convert files from hard breaks to OSPL, but particularly on the PEPs repo, this is not really necessary, aside from authors who may wish to do so when rewriting their draft PEPs.
Rather, future PEPs can simply be written using OSPL.

Those three lines have lengths of (approx) 150, 200 and 50 columns respectively. In my text editor, with word-wrapping turned off, the first two are essentially unreadable due to the need for horizontal scrolling.

With word-wrapping set to 80 columns, I get:

As this change does not affect the meaningful syntax of reST (or
Markdown) files, nor the rendered output, there is no direct backwards
compatibility impact.
We could programmatically convert files from hard breaks to OSPL, but
particularly on the PEPs repo, this is not really necessary, aside from
authors who may wish to do so when rewriting their draft PEPs.
Rather, future PEPs can simply be written using OSPL.

To me, that looks like missing blank lines between one-sentence paragraphs, not a three-sentence paragraph.
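For anyone who wants to check the figures in the quoted example, a short Python sketch along these lines measures the OSPL line lengths and soft-wraps each sentence independently. It uses only the standard-library `textwrap` module; the variable names are illustrative, and `textwrap` breaks lines greedily, so its output will not necessarily match the breaks a particular editor produces.

```python
import textwrap

# The three sentences from the quoted paragraph, one per physical line.
ospl_lines = [
    "As this change does not affect the meaningful syntax of reST (or Markdown) "
    "files, nor the rendered output, there is no direct backwards "
    "compatibility impact.",
    "We could programmatically convert files from hard breaks to OSPL, but "
    "particularly on the PEPs repo, this is not really necessary, aside from "
    "authors who may wish to do so when rewriting their draft PEPs.",
    "Rather, future PEPs can simply be written using OSPL.",
]

# Length in characters of each physical line in the OSPL source.
lengths = [len(line) for line in ospl_lines]
print(lengths)

# Soft-wrap each sentence on its own at 80 columns, as an editor with
# word wrap enabled would; no blank line is added between sentences.
wrapped = [textwrap.wrap(line, width=80) for line in ospl_lines]
for chunks in wrapped:
    for chunk in chunks:
        print(chunk)
```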

(As noted, I quoted your whole passage so others could view the intended rendering.)

We all have our own visual preferences and habits, so I certainly don’t want to dismiss the validity of your own personal observations—to each their own, of course. I do note, though, that I was originally used to the hard-break style myself, and it did take some initial adjustment when we switched over our various docs, etc. repos, but now it looks natural.

guido · February 24, 2022, 11:34pm

SHOULD hardly sounds like a recommendation to me. It’s the strongest of these “standards” verbs except for MUST.

CAM, can you please back down on this whole proposal? I worry that you are ruining the PEP process (with the best intentions, but still).

Also try to be less verbose.

Please.

Jelle · February 25, 2022, 3:10am

As a PEP editor I haven’t found the current scheme problematic. I don’t particularly care how PEP authors break their lines as long as the rendered output looks good and the source text is reasonably easy to read. A formalized line breaking style may make it harder for people to write PEPs and doesn’t offer much benefit.

I’d like instead to relax the current guidelines in PEP 1:

You must adhere to the Emacs convention of adding two spaces at the
end of every sentence. You should fill your paragraphs to column 70,
but under no circumstances should your lines extend past column 79.
If your code samples spill over column 79, you should rewrite them.

We can just say “Lines should usually not extend past column 79” and leave the rest to the judgment of the author, which is what we’re doing in practice. Recent PEPs don’t actually follow the two-space convention; I think we should drop it.

AA-Turner · February 25, 2022, 4:24am

I’d support @Jelle’s phrasing, it seems a pragmatic approach.

A

pitrou · February 25, 2022, 12:13pm

It seems to me that “one sentence per line” would dramatically reduce readability. It is not how texts are normally laid out (on whatever medium: paper or screen) and it therefore has a high adaptation cost. Also, there are probably good reasons why this is not a common convention at all, and I’m a bit baffled that the PEP process would try to innovate in terms of text presentation.

hugovk · February 25, 2022, 3:41pm

Thanks! GitHub’s diff looks good on desktop. It’s okay on mobile but perhaps a bit harder to read?

I normally have it turned on, but have just installed BBEdit on a new machine and the default is word wrapping off, so it must be pretty common?

I definitely don’t spend hours editing on mobile, but use it a fair amount for smaller edits. Here’s the above OSPL example on mobile:

A less common use case, but another place where OSPL struggles is the GitHub issue/PR comment box. Editing is fine:

But you get long horizontal scrollbars in preview and when posted/rendered:

barry · February 25, 2022, 5:49pm

+1

This seems related to wanting to use an autoformatter for Python. Some people like black and I prefer blue, but at least I don’t have to worry about style any more, and I especially don’t have to worry about ensuring I’m writing my code in the correct style, because the tool just fixes it for me. Can’t we do something similar here? If not, will my PEP PRs start to be rejected because I’m not adhering to an implicit style?

erlendaasland · February 25, 2022, 6:17pm

+1

Perhaps it is time to revisit this thread again:

pf_moore · February 25, 2022, 6:48pm

If so, then let’s do so over there. This thread is about a style recommendation for PEPs, which are text, not code. And it’s actually about the opposite of auto-formatting, being a proposal that PEPs should be written in a certain fairly strict style, which currently has to be done manually.

vstinner · February 26, 2022, 12:39am

I’m a bit worried by that. On my recent PEP changes (example), I got multiple suggestions to fix typos and grammar, but also wording or other changes. English is not my first language, so I still make many mistakes. But it’s also part of my “style”, like my “signature” (?). And I’m fine with it. In an old PEP, I accepted so many suggestions from various people that in the end I no longer recognized my own text!

Well, I know that some people do care a lot about grammar and that typos can make reading harder.

Some PEPs like PEP 670 and PEP 674 are the results of months, if not years, of work (not just the PEP itself but all the work done in advance to reach the point where a PEP can be written), so I would prefer spending more time on discussing the actual change proposed by the PEP, rather than spending too much time trying to write the ideal phrasing in English. It’s not like I’m writing a book which should be printed on paper.

I don’t know. Maybe it’s just about the process which is inconvenient to me. Maybe people who hate grammar mistakes and typos should propose a PR once my change is merged?

Anyway, thanks for the reviews.