Changing the type of a name after first usage

That sounds like a flaw in type checkers rather than a flaw in the language.

8 Likes

I don’t think all type checkers require variables to have a consistent type. Looks like mypy does, but pyright doesn’t. I don’t think this is a flaw in mypy – they’re just different typing philosophies. Allowing a variable’s static type to vary throughout a single scope increases flexibility at the cost (IMO) of making code harder to reason about.
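
For example (a sketch; exact messages vary by checker version):

x = 5      # both mypy and pyright infer int here
x = "abc"  # mypy (by default): error, incompatible types in assignment
           # pyright: accepted; x is simply str from here on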

1 Like

I think it’s a very human expectation that the same name refers to the same type of object. Hungarian notation is founded on that concept.

I think the type checkers are right in this case.

3 Likes

I guess it depends on your definition of “type”, then. I often have a variable that can store a “number”, but it might be an integer or a float depending on where we are in the code.

Come to think of it, mypy also complains about this:

x = 5
x /= 2
print(x)

However, if you’re more clear about types, this is fine:

x: int|float = 5
x /= 2
print(x)

So I guess what I really mean is that, in many contexts, union types count as “the same type”. It makes perfect sense for x to always be a number here, and you’re right, I would be quite surprised if it suddenly became an open file object; but changing between int and float is mostly irrelevant.

That said, though, there are plenty of situations where x = x.attr is perfectly valid - various data structure traversals and the like. But I still don’t think it needs syntax.
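
For instance (a minimal sketch with hypothetical classes):

class Leaf:
    value = 42

class Branch:
    child = Leaf()

node = Branch()    # node starts as a Branch
node = node.child  # x = x.attr: same name, now a Leaf
print(node.value)  # 42; valid at runtime, but a strict checker flags the rebind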

2 Likes

FYI, you can just say x: float = 5. float is essentially shorthand for numbers.Real.

2 Likes

It’s a shame that mypy can’t infer a type of float, though. If you’re going to make a point of the idea that “a name has to refer to an unchanging type” then you should make more effort to infer the intended type…

Personally, I think changing the type of a name is a perfectly reasonable pattern - in Rust, it’s shadowing - the declaration of the first x is shadowed by the new declaration. In traditional (pre-type checking) Python, it’s just called “using a variable” 🙂

IMO, if you infer incorrect types and then flag an error based on your incorrect inference, that’s a bug in the type checker - specifically, in the inference process. If I said

x: int = 5
x /= 2
print(x)

then flagging the division as a type error is reasonable. But if I say

x = 5
x /= 2
print(x)

and you flag that as a type error, then you are wrong, because I never said that x was supposed to be an integer.

3 Likes

Ah fair. Further reinforces that the true “type” in the abstract sense here is not “float” or “int” but the union of both.

1 Like

What do you want it to do? You want it to broaden the original type of a name whenever it sees the name being used as a broader type? Wouldn’t that create a lot of false negatives?

Type checkers implicitly annotate whenever variables are initialized at declaration:

x = SomeClass(5)  # x implicitly has type SomeClass
x = SomeClass[int](5)  # x implicitly has type SomeClass[int]
x = []  # x implicitly has type list[Any]

which is why

x = 5  # x implicitly has type int (shorthand for numbers.Integral)

I think this saves a lot more effort than annotating the few times when you actually wanted a broader type.

1 Like

Not give an error on valid code, basically. Think of it as “in case of ambiguity, refuse the temptation to guess”. I want it to respect explicitly declared types, but not assume types that it can’t be sure are intended.

I’m not sure what you mean by “false negatives” here, you’ll need to give an explicit example.

I assume this depends on the type checker (and it’s a quality of implementation matter, which is why I said “it’s a shame”, not “it’s wrong”…) but I would expect the checker to infer types based on usage, not just leap on the first evidence found (the initial assignment) and assume the strictest possible type solely on that basis.

And just to be precise, none of the examples you gave are declarations. Assignments are not declarations in Python (unless they have an explicit type annotation, when I guess you could class that as a declaration).

The problem isn’t the effort required to declare your intent, it’s the time wasted when the tool reports problems that don’t actually exist. For someone familiar with type systems, it’s easy enough to add the correct annotation, sure. But for people not so familiar, working out why the code is producing an error, and what to do to address that error, is a significant exercise[1].

And this isn’t just a “beginner problem” - I know how type systems work, and my first reaction was “why can’t the type checker deduce from the x /= 2 line that the type needs to be float?” And before you say that inferring int protects the user from mistyping x //= 2 as x /= 2, how is the type checker supposed to know that the mis-spelled operator was the mistake rather than an omitted float type declaration?

Again, “in the face of ambiguity, refuse to guess”. Infer Any if you have to, but don’t try to guess the user’s intention.

Anyway, this is way off-topic. The fact that .= is likely to change the type of the target isn’t a good argument against the operator. But that’s irrelevant, because there are plenty of other strong arguments against the proposal, so let’s just focus on those, and drop the typing digression.


  1. And the unexpected and confusing error is not a positive experience for someone new to type checkers, or uncertain of their value.

1 Like

Right. Reporting problems that don’t exist is called a false positive. A false negative is not reporting problems that do exist. Every form of checking is trying to minimize both kinds of error.

My contention is that your idea of how things should work has a much greater cost in terms of false negatives than the false positives it would prevent.

Understood. As I said, I don’t follow what you mean in this context by a false negative. Could you give an example? The x /= 2 example isn’t a false negative, because there is no “problem that does exist” to not get reported…

Under your idea of automatic broadening, every variable declaration would need to have its type specified in order to detect errors. That affects practically every line of code:

from collections import defaultdict

for i in range(1, 10):
    # What type is i? You'd have to specify it somewhere.
    s = "abc" + i  # False negative; i is not a string.

x = defaultdict(list)
# x's type would be unknown; you'd have to specify it.
x.append([12])  # False negative; x has no append method.

class C:
    def __init__(self, x: int):
        self.x = x

    def f(self):
        self.x.append(12)  # False negative; self.x's type would be automatically broadened under your system.

etc.

So that would be a huge amount of work to specify all these types.

You seem to be arguing against what is effectively “no type inference,” but I think that is a straw man. There is an alternative, which I’ve heard called “bi-directional inference,” where the type checker, when it needs to infer the type of a new variable that appears without a type annotation, looks at all uses of the variable in its scope to try to infer a type that satisfies all of those uses.

This is certainly more complex to implement than uni-directional inference, but it is possible, and it does reduce false positives from overly narrow inference. On the other hand, because it is more complex it can be harder for the user to understand the behavior. For example, it can cause a change made near the end of a function (which might be legitimately a type error) to manifest as a change to an inferred type much earlier in the function, resulting in a new error whose causal relationship to the change is much less clear. For a simplified example:

def f():
    x = 1
    x += 1
    return x

If we now add the line x = "foo" right before the return, under bi-directional inference you would now see an error on the x += 1 line, rather than on the newly added line. Of course we can’t say for sure which line is now wrong, but action at a distance in the reverse direction is not as simple and likely more surprising to the user.
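
For concreteness, here's a sketch of where such a checker might put the error (hypothetical output):

def f():
    x = 1      # with the new line below, x could be inferred as int | str
    x += 1     # the error now surfaces here: "str" + int is unsupported
    x = "foo"  # the newly added line
    return x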

So it is not obvious which kind of inference should be preferred.

Also note that many inference choices that seem obvious are somewhat arbitrary in their breadth. Should x = 1 result in an inference of int? Why not Literal[1]? Why not object? All are technically correct, and choosing any of them involves a judgment call about how code is most likely to be written. Bi-directional inference can make a more informed choice.
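
For instance (a sketch; each inference is defensible):

x = 1
# Candidate inferences for x, all technically correct:
#   Literal[1] -- narrowest: even a later x = 2 becomes an error
#   int        -- the usual choice: x = 1.5 becomes an error
#   object     -- broadest: any reassignment is fine, but x + 1 is an error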

6 Likes

Bidirectional inference is very expensive, and it has a lot of ambiguous cases where the checker just has to pick one of the possible solutions. That said, mypy does have some amount of bidirectional inference; it would be a lot more annoying to use if it didn’t. It’s just a question of how far you want to take it. Is it worth being able to omit one type hint if you now have to wait a second for mypy to tell you whether your code is fine? How about ten seconds on a larger code base?

Python has the advantage that the static analysis isn’t built into a compiler, so different type checkers can make different trade-offs about speed and how deep their bidirectional inference goes. In other languages such as Swift we’re not so lucky, and instead have to contend with very long compile times for what I would consider a rather small benefit in ergonomics when initially writing the code. The time you save not having to deal with false positives is quickly eaten up by the time you now spend waiting on static analysis to give you back your errors.

3 Likes

Maybe the typing discussion should be split off into another thread?

2 Likes

Yep; I split the relevant posts to here from '.=' assignment. Thanks!

In Rust, you can shadow because there is an explicit let keyword that you use when you declare a new meaning for a name.

let x = 0;
let x = "abc";

This makes the programmer’s intent to treat that variable name as a new value/type clear.

In Python, we don’t have let / var etc. to signal the intent to declare a new variable, so the same code:

x = 0
x = "abc"

gives Python type checkers no information about whether you intended to reassign the name to a new type, or whether it was an accidental mistake. Thus, the current behaviour is stricter and assumes that the developer didn’t mean to change the type.

I personally think it’s the right behaviour for a Python type checker, otherwise a lot of accidental type errors could be missed.

x = 0
x = f(x) # oops, f returns a str. Did you _mean_ that?
... # operate on x assuming it is still an int

4 Likes

Mypy allows shadowing when --allow-redefinition is set.
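
For instance (a minimal sketch of what the flag permits):

x = 0            # x: int
print(x + 1)     # x is used as an int...
x = "abc"        # ...then redefined; mypy re-infers x as str
print(x.upper())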

Example in the mypy Playground.

2 Likes

x = SomeClass(5)  # x implicitly has type SomeClass

What if SomeClass doesn’t return a SomeClass object? E.g.,

class SomeClass:
    def __new__(cls, *args, **kwargs):
        return OtherClass(*args, **kwargs)

or

class SomeMeta(type):
    def __call__(cls, *args, **kwargs):
        return OtherClass(*args, **kwargs)

class SomeClass(metaclass=SomeMeta):
    pass

and so on. There are a lot of ways that this could happen. Admittedly, it’s not very sound practice, but it’s certainly possible. Because of things like that, my opinion is that it’s better to not pre-emptively try to guess what the user is doing. Better to let the user annotate what they want to, and leave the rest alone.

I don’t agree. I think it would be better for type checkers to understand the two patterns you showed. (You would have to annotate those methods.) I don’t think they do yet though.
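
Something like this, perhaps (a hedged sketch; checker support for a __new__ that returns a different class varies, and some may reject it outright):

class OtherClass:
    def __init__(self, value: int) -> None:
        self.value = value

class SomeClass:
    # The return annotation tells a checker that SomeClass(...)
    # actually evaluates to an OtherClass.
    def __new__(cls, value: int) -> OtherClass:
        return OtherClass(value)

obj = SomeClass(5)  # a checker honoring the annotation would infer OtherClass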

1 Like