Summary: I propose a new compile-time directive that restricts Python's syntax to a strict subset. This would make it easier to teach Python to beginners, and would help people who write tools intended to help beginners learn Python.
= = =
Note: in what follows, I refer to text (blog posts, tweets, comments on mailing lists) by various people. In doing so, I may not accurately represent their opinion. Any such misrepresentation is definitely not intentional.
~ ~ ~
Since its creation, the popularity of Python has been steadily increasing. According to some measures, such as the Tiobe index, Python is now considered to be the most popular programming language in the world. For many years, Python has been a favourite first language to teach to beginners, even famously replacing Scheme as the first programming language taught at MIT in 2009.
However, Python is also becoming more complex, as can be seen by reading this post by Brett Cannon. As more and more features are added, Python becomes harder to learn for beginners. To be fair, Brett Cannon does point out in his post how the introduction of advanced features and of more helpful features to beginners can sometimes be interrelated. However, I would argue that it might be possible to decouple the two as described below.
One of the first concepts one encounters when learning programming for the first time is that (most) languages have reserved words (keywords) that cannot be used as names for objects in a user’s program.
With the arrival of Structural Pattern Matching (PEP 635; see also PEP 636), Python gained two “soft keywords” (namely ‘match’ and ‘case’) that can be thought of as “context dependent keywords”, something that is not as straightforward as the idea of reserved words.
Due to the addition of these soft keywords, parsing Python code became much more complex and required a change to a new type of parser (PEG parser) which even experienced programmers find challenging to master.
This ever-increasing complexity can make it more difficult to create and maintain tools designed to help beginners understand what went wrong in their program or to help them avoid errors. In addition to well-known linters, such as flake8 and pylint, such tools include Pyta, Pedal, the editor Thonny with its integrated “Assistant” (which can be further extended using error-explainer), friendly, and undoubtedly many others I am not currently aware of.
While there is no doubt that many more features will be added to Python in the future, I propose that a subset of Python’s syntax, which I will refer to as Python 101 hereafter, be identified and made available via a directive: with a Python 101 mode activated, any syntactical construct outside of this restricted subset would give rise to a SyntaxError.
Something like Python 101 would flag advanced syntax as invalid and help educators teach their students the basics of Python programming before moving on to more advanced topics.
As Greg Wilson wrote:
“It’s easy to say, ‘Just ignore decorators and async I/O and the := operator in class,’ but that’s disingenuous. Newcomers will bump into these things as soon as they search online for help, because they actually are helpful for people who are programming in the large; that’s just not my use case.”
The idea of having a reduced syntax for a given language as a teaching help is not new. For example, Racket lists five different language subsets designed to gradually introduce different concepts in a teaching context. As an aside: in addition, Racket includes other dialects such as a version that includes type annotations.
I do not propose to go as far as Racket, but simply to have one reduced subset of Python as a useful tool in a teaching context.
With from __future__ import ..., Python already has a mechanism in place for conditional parsing of different syntactical constructs. So, from that point of view, having a restricted subset of Python’s syntax recognized as valid would not, strictly speaking, be something new.
However, instead of requiring an actual “normal” line of code to be added to a program to restrict its grammar, I suggest that a directive enabled by a top comment, like what is done for specifying an encoding, would be preferable. Such a directive could look as follows:
# syntax: Python-101
While beginners are taught that “comments are ignored by Python”, this is certainly not the case for encoding declarations. These encoding declarations are one of the first topics covered in the official Python tutorial. I believe that having a second small inconsistency would be worthwhile: it would enable people writing books and tutorials today to tell beginners to write such a directive at the top of their program, letting them know that, while this comment has no effect on existing Python versions, future versions of Python will make use of it.
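Tool authors could start recognizing such a directive today. The following is only a sketch: the regular expression and the function name `has_python101_directive` are my own assumptions, and the restriction to the first two lines is borrowed from the rule for encoding declarations rather than from any existing specification of this directive.

```python
import re

# Hypothetical pattern for the proposed directive comment.
DIRECTIVE = re.compile(r"^[ \t]*#\s*syntax:\s*Python-101\s*$")


def has_python101_directive(source):
    """Return True if one of the first two lines is the directive comment."""
    first_lines = source.splitlines()[:2]
    return any(DIRECTIVE.match(line) for line in first_lines)


enabled = has_python101_directive("# syntax: Python-101\nprint('hello')\n")
```

Allowing the second line leaves room for a shebang line on the first, as is done for encoding declarations.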
Even among those who agree with the idea of making a restricted subset of Python available via a directive, I suspect that there might be significant differences of opinion as to what should be included in Python 101. Below, I simply offer an opinion. I am much less attached to any individual suggestion mentioned below than to the principle of having something like Python 101 made possible.
As a first principle, given a specific Python version (say Python 3.12), any program that is syntactically valid with a Python 101 mode enabled by a top comment directive should remain so, and have identical runtime behaviour, if that top comment directive were to be removed. This means that no additional syntactical construct would be allowed, such as having a repeat keyword as is done in TigerJython or my own Reeborg’s World. While it might be pedagogically useful to have additional keywords such as noexcept instead of using else in loops or try/except blocks, such keywords should not be part of Python 101 unless they were valid in a “full” Python version.
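For readers unfamiliar with the use of else mentioned above, here is a short sketch of how it behaves in a loop and in a try/except block in standard Python. The function names are made up for illustration.

```python
def find_first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            result = n
            break
    else:
        # Runs only when the loop finished without executing 'break'.
        result = None
    return result


def parse_int(text):
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        # Runs only when the 'try' body raised no exception.
        return value
```

In both cases else means "nothing interrupted the block above", which is quite different from its meaning after an if statement.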
As a second principle, there should be no ambiguity as to what a keyword is. Thus, Python 101 would not include context-dependent (soft) keywords such as match and case.
Going through the first 10 sections of the official Python tutorial, prior to the quick tour of the Standard Library, I did not see any mention of the assignment expression operator (:=), introduced in PEP 572. Given the risk of confusion with the normal assignment operator for beginners, I would argue that Python 101 should not include this relatively new operator.
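The possible confusion can be illustrated with a few lines, assuming Python 3.8 or later. The helper `compiles` is a made-up name; it only reports whether a piece of source is accepted by the compiler.

```python
def compiles(source):
    """Return True if 'source' is accepted by the compiler, False otherwise."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


data = [3, 1, 4, 1, 5]

# Normal assignment is a statement.
n = len(data)

# The assignment expression binds a name inside a larger expression.
if (count := len(data)) > 3:
    summary = f"{count} items"
else:
    summary = "short list"

# Each operator is rejected where the other one is expected.
walrus_as_statement = compiles("x := 1")          # not allowed unparenthesized
equals_in_condition = compiles("if x = 1: pass")  # not allowed at all
```

A beginner has to learn two operators that both "assign", along with rules about where each of them is permitted.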
Similarly (and somewhat to my surprise), I did not see any mention of decorators in the official Python tutorial, even in the context of discussions of classes. This is consistent with my perception of decorators as a topic beyond the scope of what would be expected to be included in Python 101; they would thus be excluded.
I would also argue that the keywords “async” and “await” should be flagged as SyntaxError, with a custom message, when Python 101 mode is enabled. The relevant usage of these keywords requires a level of understanding beyond what should be expected of beginners taking a first programming course in Python.
I feel somewhat the same, but not as strongly, about the keyword “yield” and the creation of generators.
While the keyword “lambda” is thought to be “confusing and non-intuitive”, I would argue that it should be included in Python 101 as it is often required to make simple Tkinter based applications, which may be covered near the end of a beginner’s class.
Finally, I would argue that type annotations should not be allowed as valid syntax in Python 101. While there is no doubt that type annotations are useful for advanced programmers, especially those working with large code bases, I would argue that they add unnecessary complications to Python’s syntax for beginners. For example, see the discussion “Miss annotation, but no NameError?” in the Ideas category of Discussions on Python.org, and https://twitter.com/reuvenmlerner/status/1290317124997648386. Also, see The current state of typing PEPs, a long discussion on the Python-dev mailing list, including, in particular, this comment by Christopher Barker.