Keyword Ambiguous Syntax

[Sincere apologies: my ideas were reorganized by an LLM (and then revised by me). It will not happen again. My English is broken, and Google Translate hurts me more than it helps.]

Preliminary note
Please don’t argue that I should pick different names with similar meanings, add a trailing underscore, or claim that some names “don’t make sense” to be declared. That is subjective and cannot be stated with certainty. Those are just workarounds that avoid the actual problem being discussed.


Ambiguity as an argument against syntax proposals

A common argument against new syntax proposals is that they would be ambiguous and therefore go against Python’s principles.

The issue is that this argument already applies to Python itself: keyword-based syntax is not as uniform as it is often presented, and the language already relies on exceptions to deal with that.

The clearest example is the introduction of soft keywords — words that behave like keywords only in certain contexts, while remaining valid identifiers elsewhere.


Soft keywords acknowledge the problem

Soft keywords are not just an implementation detail. They exist because strict keywords break existing code and restrict the available namespace too much.

Examples that are valid in Python:

type Age = int     # ok (soft keyword)
type = 'anything'  # ok
match = 'anything' # ok
case = 'anything'  # ok
_ = 'anything'     # ok

These examples show that Python already accepts that the same word can act as a keyword in one place and as an identifier in another.


Inconsistency between soft and hard keywords

At the same time, many keywords are still completely forbidden as identifiers, even when the surrounding syntax makes their meaning obvious:

def pause(self): ...        # ok
def continue(self): ...     # SyntaxError
def break(self): ...        # SyntaxError
def as(self, type): ...     # SyntaxError
def with(self, obj): ...    # SyntaxError

Another example:

importlib.import        # SyntaxError
importlib.import_module # ok

In these cases, there is no real confusion for either the reader or the parser. After a dot (.) or inside a function definition, this is clearly an identifier, not a control-flow statement.

The restriction is global and syntactic, not contextual.
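The examples above can be checked without running them, by asking the compiler whether each snippet is accepted. This is only a sketch; is_valid_source is a hypothetical helper name, not part of any library.

```python
def is_valid_source(source: str) -> bool:
    """Return True if `source` compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


samples = [
    "def pause(self): ...",
    "def match(self): ...",      # soft keyword, allowed as a name
    "def continue(self): ...",   # hard keyword
    "def as(self, type): ...",   # hard keyword
    "importlib.import",          # hard keyword after a dot
    "importlib.import_module",
]

results = {src: is_valid_source(src) for src in samples}
for src, ok in results.items():
    print(f"{src!r}: {'ok' if ok else 'SyntaxError'}")
```

Nothing is executed here; compile() only parses and compiles the text, so importlib does not need to be imported.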


Statements, values, and keywords

Statements usually do not produce values, which partly explains why some of them can be treated as soft keywords. Still, most statements remain hard keywords, and the line between the two is not consistent.

This is not a technical limitation, but a design choice.


About literals: None, True, and False

I intentionally leave None, True, and False out of this discussion for a simple reason:

  • They are core values of the language
  • They must be immutable
  • They behave like constants

Preventing reassignment of these names makes sense. However, that does not mean the same word should be forbidden as an identifier in every possible context.

The real issue is that Python does not try to distinguish whether a name refers to a language literal or a user-defined identifier. That points to a limitation in the grammar, not a conceptual requirement.


Why explicit prefixes help

This is where explicit prefixes used by other languages become relevant:

$continue   # variable
continue    # keyword
$True       # variable
True        # keyword

With this approach:

  • There is no ambiguity
  • Names do not need to be globally forbidden
  • The programmer’s intent is visible at the lexical level

Python is not the only language that made this trade-off. Others made similar choices. The difference is that Python also claims to avoid ambiguity as a guiding principle.


Conclusion

Given Python’s own principles — especially “Explicit is better than implicit” and “In the face of ambiguity, refuse the temptation to guess” — the current model based on globally hard keywords deserves reconsideration.

Otherwise, it gives the impression that the language relies on restrictions instead of clear mechanisms, while still claiming to avoid ambiguity.

Maybe not in today’s Python — but perhaps in a future version.

No. Don’t plan for something in “a future version” where backward compatibility has become irrelevant. That’s not going to happen. Either it’s a possibility, or it’s not.

Your entire argument is based upon one very important assumption that is perhaps not obvious: You are assuming that code is correct. You say, for example, that this is unambiguous:

def with(self, obj): ...    # SyntaxError

And that is true - if we already know that the code is supposed to be exactly that. But part of the value of hard keywords is that they DRASTICALLY reduce the possible misparsings of incorrect code. Look at this code and tell me if it’s still unambiguous:

def func(stuff):
    def with(stuff):
        stuff.frobnicate()

If you had only soft keywords, this would work. But would it be correct? I’ve provided a couple of hints that suggest that it isn’t, and yet, with only soft keywords, this would silently do nothing.

If all code in the world were perfectly correct, we wouldn’t need keywords.
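To make the "silently do nothing" outcome concrete, here is a small sketch of what that code would amount to. Because def with(...) is a SyntaxError today, it uses a stand-in name, with_; the Stuff class and the calls list are also just assumptions for the demonstration.

```python
calls = []


class Stuff:
    def frobnicate(self):
        calls.append("frobnicated")


def func(stuff):
    # Stand-in for `def with(stuff):`. This only defines a nested
    # function; nothing ever calls it, so frobnicate() never runs.
    def with_(stuff):
        stuff.frobnicate()


func(Stuff())
print(calls)  # the list is still empty
```

The author most likely meant a with statement, but the interpreter would have no way to tell.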

I agree that hard keywords are valuable as a defensive mechanism. They reduce the number of possible parses when code is incorrect, and they help turn certain mistakes into immediate syntax errors. Your example shows that clearly.

Where I think we differ is in what problem we’re optimizing for.

My argument is not that hard keywords are useless, or that everything should be a valid identifier. It’s that Python already accepts context-sensitive behavior when the cost of global restrictions becomes too high, and that means “ambiguity” is not treated as an absolute rule, but as a trade-off.

In your example, the issue is not that the code is ambiguous. It has a clear parse and a clear meaning. The concern is that it might not be what the author intended, and that’s a real concern. But Python already allows many cases where incorrect or incomplete code is syntactically valid and only caught by tooling, tests, or review.

So the question is not whether hard keywords help catch mistakes — they do — but whether permanently reserving names in all contexts is the only or best way to achieve that, especially in places where the surrounding syntax already makes the intent clear.

In other words, this is less about assuming all code is correct, and more about whether global, context-free restrictions are always the right balance between safety, clarity, and expressiveness.

Keywords should only be soft if it’s always, without any further conditions, 100% clear if they are keywords or identifiers in all contexts they can appear in.

This is e.g. not possible for import: import .a is perfectly valid syntax right now, using import as a keyword. If it became a soft keyword this could instead be parsed as import.a. Therefore import can never be a soft keyword. This applies to a lot of current hard keywords, but not all.

(Also, a reminder to please not post LLM output without disclosure: broken English is better than LLM-filtered English for communication, and it’s also against the TOS of this site.)

This is incorrect. The only places where soft keywords are used are where we’re introducing functionality that depends on new keywords but we can’t introduce those keywords as proper hard keywords because people already widely depend on those words as names.

If we were starting over, match and case would likely just be proper keywords. Same with lazy and type. A lot of things unrelated to keywords would also be different but that’s beside the point.

The reason soft keywords are not “get out of jail free” cards is that they are tricky to parse. Writing rules for them in the grammar is hard, as you need to understand how ordering of the rules will affect what gets interpreted. Emitting helpful error messages depends on special rules for error conditions, those are also harder to introduce for the same reason. And finally, soft keywords make parsing slower as they force the parser to look at more rules.

This is why, if we could, we would prefer hard keywords over soft keywords. This is why async and await, first introduced as soft keywords, were converted to proper keywords later.


As a general note, I’m leaving this discussion up as a way to communicate what I said above, but let’s be clear: soft keywords are costly and we will try real hard not to use them if at all possible. And introducing stuff like $variable is right out. It will never happen in Python. Ever.

I think you hit the exact point: it suggests a design issue. I believe prefixed identifiers could resolve it by making the distinction visible, something like $import.a, where $ is just a generic example token. Of course, it could be done some other way, not necessarily with prefixes, as long as it is explicit.

(Noted, thank you. Writing in my own words without Google Translate will take me more time.)

The answer that I expected. Totally predictable.

Yes. Adding $variable would be anathema to many in the community. The community has learnt lessons in how to discuss controversial topics from the walrus operator introduction, but $variable would hopefully just die a quiet death as being too much of a change for most.
(I develop a nervous tic at the thought of such a blatant Perl-ism ending up in Python - gotta take my meds) :)

If you are forced to use a prefix to resolve ambiguity, why should it be $? Try another prefix, for example my_: my_import.a. Oh, and you do not even need to make keywords soft for this.

This is not the same thing. Definitely.

class MusicPlayer:
  def continue(self): ... # SyntaxError
  def mp_continue(self): ... # Ok
  def mp_pause(self): ... # Ok
  def mp_stop(self): ... # Ok

MusicPlayer().mp_continue() # Seriously?

Totally different:

>>> my_import = None
>>> locals()
{..., 'my_import': None}
>>> del my_import

Closest expected behavior:

>>> locals()['import'] = 'my data'
>>> locals()
{..., 'import': 'my data'}
>>> import # SyntaxError
>>> locals()['import'] # Ok
'my data'
>>> class MusicPlayer:
...     def _(self): return 'my data'
...     _.__name__ = 'continue'
...     locals()['continue'] = _
...     del _
...     
>>> dir(MusicPlayer)
[..., 'continue']
>>> MusicPlayer().continue # SyntaxError
>>> getattr(MusicPlayer(), 'continue')() # Ok
'my data'
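The same workaround as a plain script rather than a REPL session, using setattr and getattr instead of writing into locals(). This is only a sketch; the _continue helper and the return values are placeholders.

```python
class MusicPlayer:
    def pause(self):
        return "paused"


def _continue(self):
    return "resumed"


# Attach the method under a name the grammar does not allow in source code.
_continue.__name__ = "continue"
setattr(MusicPlayer, "continue", _continue)

player = MusicPlayer()
# player.continue() would be a SyntaxError, so go through getattr().
result = getattr(player, "continue")()
print(result)
```

The attribute exists and works at runtime; only the dotted spelling in source code is rejected.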

Note for all: I am not saying that every keyword should be a soft keyword. I only pointed out the issue and gave generic examples of possible solutions. I don’t intend to provide a solution; that’s not my place.

Then what’s the point of this thread?

You can usually find synonyms for the words you wish to use. In this specific example I’d name the method resume instead.

A stronger argument for making more existing hard keywords soft can be made for frameworks that model existing database columns as class attributes, where names such as class and from are common. Even then, a column_name attribute on a field instance is usually a simple enough workaround to keep the argument from building into a compelling enough case for Python devs to change anything at the cost of parsing efficiency.

Using a symbol other than an underscore as a prefix of an identifier is not going to fly, because Python’s syntax is designed to be intuitively readable by people who are reading code for the first time, so Python only adopts a symbol in the syntax if it bears a widely recognized meaning.

To find a solution other than ‘change to a different identifier name to avoid this issue’ or similar, changing the grammar if necessary (New keyword? New prefix? I don’t know).

But you’re asking someone ELSE to actually do the work of coming up with a proposal.

If someone has an idea to discuss, they can start their own thread, they don’t need this one. This thread has no actual proposal, you said so yourself. So what is there even to discuss?

I really don’t understand how this topic ended up in ‘Ideas’, I am sure I posted it in ‘Python Help’. I don’t know if I made a mistake or if it was an erroneous category correction.

So, to summarize: I’m not presenting a proposal, I’m presenting an issue, and anyone interested can discuss solutions using the problem I’ve detailed as a reference.

This is an issue that bothers a lot of people; in the end it forces you to use different names. Even in SQLAlchemy you can see model.column.is_() or model.column.in_() as a workaround.
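For reference, the trailing-underscore workaround can be automated with keyword.iskeyword. This is a minimal sketch with a hypothetical attribute_name helper and made-up column names; it is not how SQLAlchemy itself works.

```python
import keyword


def attribute_name(column: str) -> str:
    """Append an underscore when the column name is a hard keyword."""
    return column + "_" if keyword.iskeyword(column) else column


columns = ["id", "class", "from", "in", "is", "title"]
mapping = {column: attribute_name(column) for column in columns}

for column, attr in mapping.items():
    print(f"{column} -> {attr}")
```

keyword.iskeyword only reports hard keywords, so soft keywords such as match pass through unchanged.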

This whole thing hinges on whether or not you do consider it to be bothersome that you can’t use keywords as variable names. For me, whilst I wish there was a bit more of a standard pattern for mangling a variable name that otherwise clashes with a keyword[1], it’s such a non-issue that I’d even rank it beneath the mild sadness I feel when I write

re.match("...", "...")

and the word match gets syntax highlighted as if it was a keyword.


[1] Preferably everyone would always append an underscore rather than spelling class with a k.

A proper syntax highlighter that actually follows Python’s grammar rules, such as Microsoft’s Python extension for VS Code, does not highlight match in re.match.

That’s indeed a good example where it does make sense to name the method with a keyword, since is and in are keywords in SQL and the whole point of SQLAlchemy is to model SQL operations with Python semantics, so using synonyms doesn’t make sense.

However, I don’t see how your proposal of using a prefix such as $, which allows model.column.in_() to be rewritten as model.column.$in(), necessarily helps the code read better in meaningful ways, as both involve additional symbols anyway.

The best solution is still to make those keywords soft, and if you can prove through a working prototype that the impact to parsing efficiency is negligible when you turn those keywords soft, you may actually have a good case for a change.

Making is and in soft would make sense and wouldn’t be ambiguous since they require passing arguments.

But what about the other keywords? It wouldn’t be possible to infer whether break, continue, return, pass are keywords or identifiers because they don’t require passing arguments.

This is a clear example that it’s not possible to oversimplify a grammar without incurring disadvantages.

So it seems to me that it is not possible to solve this issue completely using only soft keywords.
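A small sketch of why the bare statements are the hard case: today a lone word on a line is either a statement or an expression depending only on whether it is a keyword. The first_loop_statement helper is hypothetical, and pause is just an ordinary name used for comparison.

```python
import ast


def first_loop_statement(word: str) -> ast.stmt:
    """Parse `word` as the only line of a loop body and return its node."""
    tree = ast.parse(f"while True:\n    {word}")
    return tree.body[0].body[0]


nodes = {word: first_loop_statement(word) for word in ("continue", "break", "pass", "pause")}
for word, node in nodes.items():
    print(f"{word}: {type(node).__name__}")
```

If continue were soft, the same line could be read either as the statement or as an expression that merely evaluates a variable named continue, and nothing in the surrounding syntax would settle which one was meant.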