Currently, Python does not distinguish between variable declaration and assignment, which allows variables to be redefined. This can lead to potential security issues. Here are some scenarios I can think of that always make me feel uneasy when writing Python code.
Scenario 1:
Functions have their own scope: a name first assigned inside a function is local to that function, while names from the outer scope remain readable. However, when working with an outer-scope variable we sometimes need to assign to it as well, and a plain assignment silently creates a new local variable instead. This is not to say the problem cannot be solved, only that it invites mistakes. Programmers have to stay alert to it, and that need ultimately stems from a flaw in Python’s design. Languages that have variable declarations do not have this problem.
Scenario 2:
In larger projects, functions with hundreds of lines of code frequently appear. While we might be aware of issues caused by variable redefinition when initially writing code, I often need to be extremely careful when creating new variables in certain scopes while modifying a huge application. I need to check whether corresponding variable names already exist above, and I’m always worried about accidentally overwriting function parameters or previously defined variables when modifying code. This feeling is terrible, especially since everyone has moments when they’re not thinking clearly, which only increases the risk. Meanwhile, Python itself does not implement any effective measures to prevent programmers from making mistakes.
Therefore, I propose adding the concept of variable declaration to Python. Below is a temporary proposal. This proposal is essentially not about specific syntax, but rather about the concept of variable declaration. The syntax itself is something that can be discussed.
I very rarely find myself in the situations you describe, so this is either something that I managed to ingrain in myself early on, or the practices I’ve built over the decades led me to a style of programming that doesn’t lead to these problems. This is not just to say that I’m not sure this would help me, but that I may be underestimating how much this would help new users.
Any proposal along these lines will have to deal with modules that do not declare variables, and the semantics there need to remain unchanged. The alternative is a massive, ecosystem-wide churn, and many old packages becoming unusable.
So what specifically do you want to introduce? JS-style const and let? Or just let? What changes occur at compile time (are there new SyntaxErrors?) and runtime? What happens if I eval code with declarations in a module without declarations, and vice-versa?
Re S1: Try to avoid globals and nonlocals. If globals are really necessary, use a naming scheme, e.g. a g prefix: gFoo, gBar, etc. This helps you remember the special handling.
Re S2: Do not allow a function with “hundreds of lines of code”. If you cannot visually inspect a function on one screen, you should split it up into smaller functions.
Yes, I had and will have both issues in my code. But not so often that I want to change Python for these. And both can be mitigated with a corresponding coding style.
Test all code, so you might find the remaining issues…
Could you please provide examples, as it was suggested in the issue thread? Currently your proposal looks rather abstract.
I second “Do not allow a function with ‘hundreds of lines of code’.”
Descriptive variable names, type checks and a good text editor also help.
Sorry, your post entirely misses that part too. Please describe this in some form. As it will change syntax, you should also address the issue of backward compatibility.
Are these experiences borne of maintaining large, complex projects with multiple contributors? Testsuites? Static analyzers and type checkers? Because they really don’t sound like it.
Linters are actually very relevant to any suggestion for a change to the language – a frequent question which proposals are asked to answer on this forum is why the proposal cannot be solved with the existing ecosystem of tools.
Pylint, for example, calls one of its pertinent rules redefined-outer-name, and it simply forbids name shadowing.
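As a rough illustration of what that rule covers, here is a small sketch. The file name shadowing.py and the names timeout, connect and retry are made up for this example. Pylint's redefined-outer-name check (W0621) is expected to flag both the parameter and the local assignment, since each reuses a module-level name.

```python
# shadowing.py (hypothetical file name, used only for this illustration)
timeout = 30  # module-level name


def connect(host, timeout):
    # The parameter reuses the module-level name "timeout";
    # pylint is expected to report redefined-outer-name (W0621) here.
    return f"{host}:{timeout}"


def retry():
    # A plain assignment creates a local that shadows the module-level name;
    # pylint is expected to report redefined-outer-name (W0621) here as well.
    timeout = 5
    return timeout
```

Running something like `pylint --disable=all --enable=redefined-outer-name shadowing.py` should restrict the output to this one rule. Note that the module-level timeout is never changed by either function.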
I urge you to be very (read: way more) slow to call something a flaw in the language. So far your post sounds to me like it’s revealing some poor practices of yours which you could ask about in the Help category.
I could be mistaken, but that is how it comes off.
The reason I didn’t include code examples is that the description was already sufficiently clear. In my view, if one cannot identify issues through the given description, simplified code examples usually won’t provide better insights. While this problem might not cause significant issues in simple scenarios, I’ll nevertheless supplement with simplified code examples here.
Scenario 1:
a = 1
def some_func():
    # Imagine many lines of code here
    a = 2  # This accidentally creates a new variable. Distinguishing variable declaration from assignment would naturally prevent this issue.
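For readers who want to run the Scenario 1 behaviour, here is a minimal sketch of how current Python treats these assignments. The names counter, shadow, rebind and read_then_assign are illustrative and do not come from the post above.

```python
counter = 1  # module-level name (hypothetical)


def shadow():
    counter = 2  # plain assignment binds a new local; the global is untouched
    return counter


def rebind():
    global counter  # explicitly opt in to rebinding the module-level name
    counter = 2


def read_then_assign():
    try:
        # counter is local to this function because it is assigned below,
        # so reading it before that assignment fails.
        value = counter + 1
    except UnboundLocalError:
        return "UnboundLocalError"
    counter = value
    return counter


local_result = shadow()
after_shadow = counter   # the module-level value is still 1
rebind()
after_rebind = counter   # the module-level value is now 2
error_name = read_then_assign()
```

The last function shows the one case the interpreter does catch: reading a name before the local assignment raises UnboundLocalError. The silent shadowing in shadow() is the case the post is worried about.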
Scenario 2:
def some_func(arg1, arg2):
    # ...
    arg2 = something  # This overrides parameter arg2. Similarly, distinguishing declaration from assignment would naturally prevent this.
I should clarify: I’m fully aware of basic syntax like global and nonlocal. My point is precisely that these mechanisms are error-prone. Nor am I ignorant of so-called “best practices”. As someone who regularly works with multiple programming languages, I consider the failure to distinguish variable declaration from assignment to be a language design flaw in modern programming languages. As demonstrated above, this creates unique problems in such languages. What I seek is a solution to prevent variable redefinition. I hope we can agree that distinguishing declaration from assignment would inherently prevent certain categories of errors, eliminating the need to rely on “best practices” to avoid pitfalls, thereby reducing mental overhead. Without this fundamental consensus, meaningful discussion becomes difficult.
Regarding the claim that “a function shouldn’t exceed 100 lines”, I’ve yet to encounter projects that strictly adhere to this guideline. Even if they exist, I wonder: what percentage of Python projects actually follow this rule?
You’ve drawn a false equivalence between “there’s an error which a Python user can make” and “there’s an action which the Python language should take”.
The two are not the same, and if this thread is to go anywhere, you need to shake free from that assumption.
You aren’t being asked for examples to explain how name shadowing works.
You’re being asked for evidence that this is a problem that the language needs to change to solve.
I agree on some particulars, e.g., that languages with stricter syntax prevent certain flavors of errors. But I utterly disagree with your conclusions, e.g., that the assignment semantics of Python incur significant mental overhead for typical users.
One of the drivers of Python’s popularity is the extremely minimal “no fuss” syntax it uses. x = 1 assigns an int. If you want to change this, you need to make a convincing argument.
Maybe I missed it, did you make a concrete proposal? Syntax isn’t that important, but semantics are. What changes are you asking for?
Keep in mind: very few ideas discussed in this forum make it to a formal proposal stage, and fewer still get adopted. It is not easy to change a language.
These are the kinds of questions your proposal should address.
Type hints already distinguish declarations from assignments, and type checkers complain about multiple declarations in the same scope, even if the types are consistent:
$ cat example.py
a: int = 0
a: int = 1
$ mypy example.py
example.py:2: error: Name "a" already defined on line 1 [no-redef]
Found 1 error in 1 file (checked 1 source file)
I’m not aware of any typecheckers which have an opt-in rule to disallow variable type inferencing entirely (since it would be spectacularly annoying in practice), but adding such a rule wouldn’t require a language change, just an amenable typechecker development team.
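A related tool that already exists is typing.Final, which lets a type checker reject later reassignment of a name. The sketch below is only an illustration: compute_total and tax_rate are made-up names, and the reassignment is included deliberately to show what a checker would flag. Final is not enforced at runtime, so the code still runs.

```python
from typing import Final


def compute_total(prices: list[float]) -> float:
    tax_rate: Final = 0.2  # declared once; a type checker treats it as fixed
    total = sum(prices)
    # Deliberate mistake for illustration: mypy is expected to reject this
    # line with an error along the lines of 'Cannot assign to final name'.
    tax_rate = 0.25
    # At runtime nothing stops the reassignment, so 0.25 is used here.
    return total * (1 + tax_rate)
```

This is opt-in per name, so it does not change anything for code that never uses it. As far as I know, Final cannot be applied to function parameters, so it does not cover the parameter-overwrite case.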
For example, Python emphasizes having one obvious way to accomplish a task. While some may see the existence of multiple ways to achieve the same result in other languages as a design flaw, this critique may be overly harsh. It reflects a deliberate design choice. Python, however, follows its own philosophy, grounded in sound reasoning and clarity.
You might try making global variables all capitals - which is done by many people, and so easy to understand. You could also seek out an IDE with features which help with your common problems.
There are many people who code without ever writing a thousand-line Python file, and many who write 100 lines of code or fewer to solve what are, for them, important problems. Each language makes its own decisions, which lead to unique trade-offs. Try looking at how your suggestion might affect others - in fact you have, by posting here!
In scenario 1, if you add a way to declare local variables, it should not be required, otherwise it will break all existing Python code. You can still inadvertently introduce a local variable through assignment.
In scenario 2, you can catch an error when trying to declare an already existing variable, but you cannot catch an error when your variable is re-assigned in another part of the 100-line function by code that creates a variable through plain assignment. And you cannot catch the missed declaration – code without an explicit declaration should still work the same way.
This is not a good take. This request will not happen for philosophical reasons, not technical ones. Presenting technical reasons just encourages folks to come up with technical solutions, e.g. an alt-python language could introduce pragmas à la Perl:
pragma strict
def fictitious():
    x = 7  # raise syntax error
    int x = 7  # valid in alt-python
Of course, this will not happen because it’s not Python.
To me the idea looks like a way of introducing a new category of errors - forbidden operations on uninitialized variables. Python does not have “hanging” symbols - if there is a symbol, that means there is a value behind it.
Both scenarios present badly written code - frequent modification of values not present in locals(), and huge functions. Both are rarely needed, or at least should be.
The solution for bad code is good code, no matter how trivial it sounds.
I know too well that the code we have to work with is sometimes a hot mess. But declarations are not a good idea.
… an alt-python language could introduce pragmas à la Perl:
pragma strict
def fictitious():
    x = 7  # raise syntax error
    int x = 7  # valid in alt-python
Of course, this will not happen because it’s not Python.
This already happened years ago in the form of type checkers (you just write x: int = 7 instead of int x = 7). There is no need for an alt-python in order to achieve this.
Sure, and we have dataclass declarations with normally inaccessible members. And docstrings working as function body and so on.
I treat these as unfortunate corner cases, problems to be fixed in the language rather than a rule. I would happily see them go, and I would rather not multiply them.
So I should have written “Python does not have “hanging” symbols … with some edge case exceptions”.
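To make the edge case concrete, here is a small sketch of an annotation-only name in a dataclass. Point is a made-up class for this illustration. The annotation declares the field, but no value is bound on the class itself until an instance is created.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int  # annotation only: declares a field, binds no class attribute


# The name is recorded as an annotation on the class...
declared = "x" in Point.__annotations__
# ...but there is no value behind Point.x at class level.
bound_on_class = hasattr(Point, "x")
# The value only exists once an instance has been constructed.
bound_on_instance = Point(x=3).x
```

So a declared name with nothing behind it is already possible in this narrow form, which is the exception being conceded here.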