"""This is not ok because from is a keyword. My proposal is to allow this:"""
class Test:
    from: int = 10  # Throws a SyntaxError
    to: int = 10

test = Test()
print(test.from)
This is especially useful for dataclasses built from API responses, which often use from as a field name, for example in time intervals:
import dataclasses
import datetime

@dataclasses.dataclass
class TimeInterval:
    id: int
    from: datetime.datetime  # SyntaxError today: from is a keyword
    to: datetime.datetime
I ran into a similar situation many years ago, when implementing an API to an external simulation program. The modeling language had an element with various properties, two of which were "print" and "return".
In our ORM, you could do stuff like this, but those last two forced us to use setattr instead of the obvious reserved-word naming.
part = model.Part()
part.mass = 100
part.location = 10,0,20
debug = model.Debug()
debug.print = True # Oops. Works after 3.0, but in the 2.x days it failed.
debug.return = "OnError" # Still fails.
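The setattr workaround mentioned above can be sketched roughly as follows. The Debug class is a hypothetical stand-in for the ORM's model.Debug, which the post does not show; only the built-ins setattr, getattr and vars are real here.

```python
class Debug:
    """Hypothetical stand-in for the ORM element; not the real model.Debug."""


debug = Debug()

# `debug.return = "OnError"` is a SyntaxError, but an attribute name is just
# a string at runtime, so the built-ins accept it without complaint.
setattr(debug, "return", "OnError")
setattr(debug, "print", True)

print(getattr(debug, "return"))
print(vars(debug))
```

The attribute exists and round-trips fine; it is only the dotted spelling in source code that the parser rejects.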
If I'm remembering right, your idea has come up before, now that the PEG parser would fairly easily allow all these traditionally reserved words to become soft keywords.
Thank you for the input. If I understand you correctly this idea has been brought up before and is also something you would have found useful in a project of yours a while back? And it is probably easier to implement now that CPython uses a PEG parser?
I tried searching for an active issue about this in the CPython GitHub repository, but couldn't find any. I thought of creating an issue about this myself. The only thing holding me back is any potential obvious caveats that I have not thought of, so I decided to ask here first.
My recollection is that there was discussion on the python-ideas mailing list back when PEG was formative, this being one of the arguments to go ahead with that project. I don't recall if it predated the svn → git conversion, but it may very well have (or been coincident with the switch), so I'm not surprised there's nothing on GitHub.
The tradition is to append a _ to the name, e.g. from_ when something clashes with a keyword or built-in object name.
Yes, but that isn't actually a good thing. We purposefully only use that for extreme situations where new syntax strongly makes sense, as with match, where there was a chance of clashing with pre-existing variable names. We still prefer to keep the grammar relatively simple.
Sure, but (in my cases at least) it's likely you'll be de/serializing the class into JSON or whatever, so now you have to configure your serialization library to do a rename. Which is generally possible, but frustrating when it'd be so much easier if the language let you use the right name from the start.
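A minimal sketch of that rename step, assuming the trailing-underscore convention and plain JSON rather than any particular serialization library. The helpers to_wire and from_wire are hypothetical names made up for illustration, and the timestamps are kept as strings to keep the example short.

```python
import dataclasses
import json
import keyword


@dataclasses.dataclass
class TimeInterval:
    id: int
    from_: str  # trailing underscore because `from` is a keyword
    to: str


def to_wire(obj) -> dict:
    """Hypothetical helper: drop the trailing underscore from keyword fields."""
    out = {}
    for field in dataclasses.fields(obj):
        name = field.name
        if name.endswith("_") and keyword.iskeyword(name[:-1]):
            name = name[:-1]
        out[name] = getattr(obj, field.name)
    return out


def from_wire(cls, payload: dict):
    """Hypothetical helper: add a trailing underscore to keyword keys."""
    kwargs = {
        (key + "_" if keyword.iskeyword(key) else key): value
        for key, value in payload.items()
    }
    return cls(**kwargs)


raw = '{"id": 1, "from": "2024-01-01", "to": "2024-01-02"}'
interval = from_wire(TimeInterval, json.loads(raw))
print(interval.from_)
print(json.dumps(to_wire(interval)))
```

It works, but every class that touches the wire format needs to go through a layer like this, which is the friction being described.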
I'm assuming that's because you don't control one of the endpoints and thus the format isn't under your control. In which case that sucks and I'm sorry that Python puts you through that if you're converting this into object attributes (which I will assume the maintainer of cattrs is doing), but that's not going to convince the SC to make all keywords soft for the entire language (I was a part of that discussion around match and bringing in the PEG parser, so I'm not speculating here).
I would like to see the ability to specify an arbitrary string as attribute name for purposes like this.
AFAIK there is no keyword that can ever directly follow ., so I think it should not be too hard to argue that . would force the following word to always be parsed as an identifier rather than a keyword.
For cases where there is no . before the identifier name, such as class attribute annotations as mentioned above, but also kwarg names at call sites (e.g. when constructing a dataclass), perhaps it could be allowed to use a leading . to likewise parse the following word as an identifier.
For consistency, it would also be allowed to use . to indicate an identifier even if there wasn't a keyword collision. This would allow people to use . for every name in a context where at least one name required it, which'd be a bit tidier than only using this workaround for keyword collisions. So this would become valid:
@dataclass
class Foo:
    .from: Bar
    .to: Bar

foo = Foo(.from=some_bar, .to=some_other_bar)
print((foo.from, foo.to))
An alternative is to use a syntax including single or double quotes to specify "treat this string as an identifier name". The Zig language uses the syntax @"name", as in foo.@"from". In Zig this syntax allows any valid string to be used, so you could have foo.@"bar$baz" (resulting in accessing an attribute named bar$baz, which might be helpful if you're somehow interfacing with something in Java or JavaScript where $ may appear in an identifier) or even things like foo.@"a.(very ; strange\"name". If Python were to use similar syntax, I'm not sure if it should restrict the content to still match /[a-zA-Z_][a-zA-Z0-9_]*/ or if it should allow any arbitrary string like Zig does, but it should at least make it possible to overcome keyword collisions.
Yes, to both of you, those are the solutions I use.
The first doesn't work when the attributes you're dealing with come from some external source that isn't bound by Python's keyword list, and the second is somewhat untidy.
I'm really just putting my 2 cents out here for consideration.
The new PEG parser in Python can actually manage the meaning of keywords depending on context, but having worked with languages that do that (COBOL, NATURAL, PERL), and considering the small set of Python keywords, I'd opt for keeping the set of keywords small, and avoiding them as identifiers in programs.
Because the original example is over Foo and Bar itâs difficult to recommend alternate naming, but usually a longer and more explicit name resolves the collision with keywords, for example from_position, or initial.
In the end, a language should be designed for ease of reading (by humans, many times over) rather than ease of writing.
I agree that making all keywords soft is probably too disruptive and can easily make code unreadable (like, imagine a comprehension with variables named for or if…)
However, in the specific case of from, maybe it can be relevant? I've found myself several times in the exact situation of the original post, needing an additional rename step to serialize/unserialize data using "from" and "to" fields.
Since from can only be used in very specific context (from ... import ...), maybe making it a soft keyword has a positive balance?
REXX has no keywords whatsoever, and thus you really could do that sort of thing… it's a blessing and a curse! The blessing is that you can have extremely situational keywords (eg the "PARSE VALUE x AS y" statement, in which "value" is a keyword - and keywords are case insensitive, so it's great that that doesn't stop you from using "value" as a variable name), but the curse is exactly what you say: it's entirely possible to write extremely unreadable code.
Having a number of hard keywords helps a lot with hard error messages. When there are multiple ways you could potentially parse something, it's possible - and all too common - to have "garden path" sentences where you have to back up a long way and reinterpret what you thought you already understood. (You've probably heard that time flies like an arrow, and fruit flies like a banana.) That then leads to errors being reported a long way from the actual bug, since everything prior to that was perfectly legal and grammatical, but might not have had the interpretation you intended.
Good use of keywords can help to prune the grammatical tree early, enforcing what is to be interpreted. Yes, sometimes it excludes an otherwise-valid interpretation, but that's the price paid for the 99% of the time when it's beneficial. Languages with almost no keywords tend to have a lot more grammar words to them (like REXX that I mentioned earlier), to help guide that interpretation.
It's a language that I spent many years working with, and have no regrets about! Great language. Imagine a shell scripting language that gets enhanced with some more features to make it a general-purpose language. Think like bash scripts, but even more so. Now add in the ability for extension libraries (where shells generally just spawn subprocesses for everything), some GUI libraries (VREXX, VPREXX, VX-REXX), and a few things like that, and you have a quite viable scripting language. Plus, OS/2 made it really easy to call on the REXX interpreter from another program (think like embedding CPython, only the interpreter's actually provided by the OS), so REXX scripting was the single most popular embed language on OS/2, making it the universal language.
The strongest opinion I have is "don't change it", and that's not a particularly strong opinion. But there needs to be a strong case for the change, and I think the reasons given here are not much stronger for "from" than for any other keyword.
And I do have a strong opinion on "make them ALL soft", which is what I said above. So while I wouldn't stand in your way if you think "from" is special, I'd also not get behind that argument without a bit more explanation of what makes this, in particular, either more beneficial for non-keyword use or more harmful as a keyword.
From personal experience, it is close to the only one I have ever encountered. This is also supported by a quick count in the stdlib. The most common usage as an identifier is assert_, next is from_, followed by class_. The test suite also uses import_ a lot.
assert_ is basically only used in a single file: it's a helper function in wsgiref/validate.py that behaves like a normal assert statement except that it doesn't get optimized away. Not sure why it exists.
class has a well-known, OK-enough spelling: cls. Specifically, class_ is used about as much as from_ in the stdlib.
import is, in the stdlib at least, only used by tests or importlib, which are kind of specialized use cases.
from_ is used a bit more all over the place: tkinter (and its family, idlelib and turtle) and mailbox use it, as well as tests for unicode.
from OTOH doesn't have a clear alternative spelling I am aware of, and it is quite a common name for a parameter in more abstract definitions of protocols. In Python it is only used for the from ... import statement, which means it's only ever used in a very clear context and in relation to another keyword, so typo detection wouldn't really be hampered by turning it into a soft keyword (although of course, relative imports, from.module import ..., come to mind as slightly conflicting).
Not that I am necessarily in favor of changing its status. But it is IMO a bit special in contrast to most other keywords.
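For anyone who wants to repeat a count like the stdlib one above, here is a rough sketch. It is not the script that produced those numbers; count_keyword_underscore_names is a hypothetical helper, and it only looks at one source string rather than walking a directory tree.

```python
import io
import keyword
import tokenize
from collections import Counter


def count_keyword_underscore_names(source: str) -> Counter:
    """Count identifiers spelled as a hard keyword plus a trailing underscore."""
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string.endswith("_"):
            if keyword.iskeyword(tok.string[:-1]):
                counts[tok.string] += 1
    return counts


# Tiny made-up sample; a real count would read each .py file in the stdlib.
sample = (
    "def send(from_, to):\n"
    "    class_ = type(from_)\n"
    "    return class_, from_\n"
)
counts = count_keyword_underscore_names(sample)
print(counts.most_common())
```

Using tokenize rather than a plain text search avoids counting matches inside strings and comments.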
Okay, out of curiosity I did a quick GitHub search of all keyword_ usages in Python files, which is the canonical way to bypass the hard-keyword limitation.
Here are the results (sorted from highest count to lowest count):
It's also part of raise and yield. But in the case of import, the word from is at the START of the statement. That makes it a bit trickier to replace, since that's a good way to end up with a garden-path statement. Currently, if you see the word from at the start of a parsing context (say, when you're expecting a new statement), you expect it to be followed by an importable thing, then import, and a thing to import from it. But if from were a soft keyword, the word on its own would be a valid expression, something like this:
try:
    raw_input  # an expression with just a name
except NameError:
    raw_input = input
So, if I have statements like these, where are the bugs?
from (spam) import ham
from spam; import ham
It's entirely possible to interpret these as from-imports, but also as function calls or perhaps assignments. Where do you pinpoint the error, and how much confusion will it cause?
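For comparison, this is a small sketch of how the current parser, with from as a hard keyword, handles those two statements. first_syntax_error is a hypothetical helper; compile only parses the source and does not run any import. The exact messages and columns vary between Python versions, so none are quoted here.

```python
def first_syntax_error(source):
    """Return (line, column, message) for a SyntaxError, or None if it parses."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.lineno, exc.offset, exc.msg
    return None


snippets = (
    "from (spam) import ham",
    "from spam; import ham",
    "raw_input",  # a bare name is a valid expression statement
)
for snippet in snippets:
    print(repr(snippet), "->", first_syntax_error(snippet))
```

Today both from statements are rejected close to the offending token, because from at the start of a statement can only mean one thing. With a soft from, the parser would have more than one reading to consider before giving up.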
Soft keywords make a lot of sense in contexts where they can't possibly occur at the start of a statement. For example, "await" can't (normally) appear outside of a function declared with "async def", which means that the keyword was able to be introduced as a soft keyword - at top level, you could say "await = 1" or "def await(x)" without issues, but if you wanted to say "await thing()", that would happen in the clearly-defined context of an async function. Similarly, "as" could be made a soft keyword, I think (although there's not a lot of call for it), since every use of it follows some other keyword (import, with, except, match), so it would be unambiguous.
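The hard/soft distinction as it stands today can be checked with the standard keyword module. This is only a sketch: keyword.softkwlist is looked up defensively because it does not exist on older versions, and its contents differ between releases (on recent ones it includes match and case).

```python
import keyword

print(keyword.iskeyword("from"))   # hard keyword
print(keyword.iskeyword("match"))  # not a hard keyword
print(getattr(keyword, "softkwlist", []))

# A soft keyword is still usable as an ordinary name.
compile("match = 1", "<example>", "exec")

# A hard keyword is not.
try:
    compile("from = 1", "<example>", "exec")
except SyntaxError as exc:
    from_error = exc.msg
else:
    from_error = None
print(from_error)
```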
I'm not really surprised about class in your list. Even with the common abbreviation cls, it's still going to be extremely common. But it also would be a poor choice for a soft keyword IMO, since it always starts a statement. in might be a better choice, and from as mentioned is somewhere in between. It all depends on what kinds of confusion it would cause by permitting it, compared to what kinds of confusion you get by rejecting it.
Oh right, yield from and raise ... from; I forgot those, I almost never use them. Those do actually make me not want to have from as a soft keyword. Writing yield from and forgetting the rest of the statement does not seem like an impossible mistake, nor does yield from seem like an unlikely intended statement. Your two examples with from ... import seem less likely to me, but do of course exist. Also, "beginning of a statement" is clearly not an indicator, since we have match and case as soft keywords [1]. Quite the opposite IMO: if soft keywords might be part of expressions as keywords, I would find that more confusing and harder to produce good error messages for (so I like in even less than from; from in expressions is at least a context-dependent special form).