I’ve seen other threads explain that PEP 8 is guidance on how Python code should be formatted, not necessarily a requirement. (Granted, I think it is now a requirement for contributing to the standard library.)
While I agree that nobody is required to follow PEP 8 outside of projects that enforce it for contributions, most IDEs and projects base their defaults on it. Because PEP 8 is a standard that encourages clean, readable, maintainable code, I would like to suggest a tweak that could make a lot of code more readable at little cost.
I’m really curious what others in the community think about this.
Background
The current guidance in PEP 8 discourages extraneous whitespace for aligning assignments or other constructs, prioritizing simplicity and avoiding unnecessary visual clutter over alignment for purely aesthetic reasons. While this guidance works well in many scenarios and is easy to apply, there are cases where alignment significantly improves code readability, particularly in settings with related variables or structured data.
I want to propose an update to Python’s style guide regarding the use of whitespace for aligning assignments and similar constructs. The goal is to provide more nuanced guidance that balances readability and simplicity while discouraging excessive alignment practices that may lead to overly complex formatting. The proposed changes address the community’s growing need for clarity on this topic, as evidenced by feature requests in popular tools like PyCharm. Note how many similar, duplicate, and related issues exist there.
I know unrestricted alignment can lead to overly complex formatting, making code harder to maintain and even read in some cases. This proposal seeks to strike a balance by refining the guidance to allow alignment in specific scenarios where it enhances readability, without encouraging overuse.
Current Example Discouraging Alignment from PEP 8 (as of 2025-01-03)
x             = 1
long_variable = 2
PEP 8 currently marks this as incorrect, recommending the simpler:
x = 1
long_variable = 2
Issue:
While the above guidance discourages unnecessary alignment, it does not consider cases where alignment can improve readability by enabling cognitive grouping of related variables, making the structure and relationships between values clearer at a glance, such as:
min_value = 0
max_value = 100
average   = 50
Here, alignment groups related variables and enhances visual structure.
Proposal
- General Principle: The primary goal of alignment should always be to enhance readability. Avoid alignment if it introduces complexity or makes code harder to maintain.
- Allow Grouped Alignment: Alignment is permissible within logical groups of related variables where it improves clarity. For example:
  width  = 800
  height = 600
  depth  = 256
  However, avoid aligning across unrelated code blocks or sections.
- Limit Scope: Avoid excessive alignment or overuse, such as aligning across significantly different variable lengths or large code blocks. Consider breaking variables into shorter, logical groups if alignment spans more than a few lines.
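As a rough illustration of the grouped alignment idea, here is a minimal sketch of how a formatter might apply it. It is purely hypothetical: the function names, the choice of blank lines as group separators, and the restriction to simple name = value lines are my own assumptions, not anything PEP 8 or an existing tool defines. It also works line by line and does not actually parse Python, so treat it as a toy.

```python
import re

# Matches a simple "name = value" line (assumption: plain names only).
# Augmented assignments, comparisons, attributes and annotations do not match.
_ASSIGN = re.compile(r"^(\s*)([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(?!=)(.*)$")


def align_group(lines):
    """Pad the names in one group so that the '=' signs line up.

    Hypothetical policy: a group is only aligned if every line in it is a
    simple assignment; otherwise the lines are returned unchanged.
    """
    matches = [_ASSIGN.match(line) for line in lines]
    if not lines or not all(matches):
        return list(lines)
    width = max(len(m.group(2)) for m in matches)
    return [
        f"{m.group(1)}{m.group(2).ljust(width)} = {m.group(3)}"
        for m in matches
    ]


def align_by_blank_line_groups(source):
    """Align each blank-line-separated group independently."""
    out, group = [], []
    for line in source.splitlines():
        if line.strip():
            group.append(line)
        else:
            # A blank line closes the current group.
            out.extend(align_group(group))
            out.append(line)
            group = []
    out.extend(align_group(group))
    return "\n".join(out)


example = (
    "width = 800\n"
    "height = 600\n"
    "depth = 256\n"
    "\n"
    "name = 'Alice'\n"
    "occupation = 'Engineer'"
)
print(align_by_blank_line_groups(example))
```

Because each group is padded only to its own longest name, a long name in one group never pushes the values of an unrelated group far to the right, which is the kind of overuse the Limit Scope point is trying to avoid.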
Backwards Compatibility
The proposal does not mandate alignment but refines guidance. Existing codebases remain “compliant” if they follow current PEP 8 recommendations. This gives IDEs the freedom to implement the requested features while still being “in compliance” with PEP 8.
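To illustrate why this should be low-risk: alignment only changes whitespace, so the parsed program is the same either way. A minimal sketch using the standard library ast module (the helper name same_program is just something I made up for this post):

```python
import ast

unaligned = "x = 1\nlong_variable = 2\n"
aligned = "x             = 1\nlong_variable = 2\n"


def same_program(a, b):
    """Return True if both sources parse to the same syntax tree.

    ast.dump() leaves out line and column information by default, so
    differences in whitespace alone do not affect the comparison.
    """
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


print(same_program(aligned, unaligned))  # prints True
```

So a tool that adds or removes alignment is making a purely cosmetic change, and existing code does not need to be touched.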
More Examples
Good Alignment:
Two Variables:
x = 10
y = 20
Three Variables:
name       = "Alice"
age        = 30
occupation = "Engineer"
Six Variables:
red    = 255
green  = 0
blue   = 0
alpha  = 1.0
width  = 800
height = 600
Good and Bad Examples for Complex Assignments:
Good Example:
val_set           = get_value_set(vals)
thread_safe_count = ThreadSafeInt(len(val_set))
chunk_size        = thread_safe_count / len(THREAD_COUNT)
val_chunks        = divide_by_chunks(val_set, chunk_size)
threads           = list()
printers          = math.ceil(THREAD_COUNT * 0.33)
start             = time.time()
or
val_set           = get_value_set(vals)
thread_safe_count = ThreadSafeInt(len(val_set))
chunk_size        = thread_safe_count / len(THREAD_COUNT)
val_chunks        = divide_by_chunks(val_set, chunk_size)

threads  = list()
printers = math.ceil(THREAD_COUNT * 0.33)
start    = time.time()
or
val_set           = get_value_set(vals)
thread_safe_count = ThreadSafeInt(len(val_set))
chunk_size        = thread_safe_count / len(THREAD_COUNT)
val_chunks        = divide_by_chunks(val_set, chunk_size)
threads  = list()
printers = math.ceil(THREAD_COUNT * 0.33)
start    = time.time()
Bad Example:
val_set = get_value_set(vals)
thread_safe_count = ThreadSafeInt(len(val_set))
chunk_size = thread_safe_count / len(THREAD_COUNT)
val_chunks = divide_by_chunks(val_set, chunk_size)
threads = list()
printers = math.ceil(THREAD_COUNT * 0.33)
start = time.time()
The bad example lacks alignment, making it harder to visually parse related variables and their values.
Avoid Overuse:
Two Variables:
short                        = 1
very_very_long_variable_name = 2
Instead, prefer:
short = 1
very_very_long_variable_name = 2
Three Variables:
id                      = 12345
short_name              = "A"
very_long_variable_name = "Example"
Instead, prefer:
id = 12345
short_name = "A"
very_long_variable_name = "Example"
Group large sets by length:
var1        = "A"
var2        = "B"
var3        = "C"
long_name_1 = "D"
long_name_2 = "E"
long_name_3 = "F"
Instead, group logically:
var1 = "A"
var2 = "B"
var3 = "C"

long_name_1 = "D"
long_name_2 = "E"
long_name_3 = "F"
Sorry for the long read, and thanks for making it this far.