Safe_cast (again)

NeilGirdhar · March 18, 2025, 7:51am

This was discussed before, but I’ve run into some new problems with my safe_cast. What I want to do is:

        total = sum(safe_cast(int | float, task.total) for task in self.tasks)

where

from typing import Any

def safe_cast[T](typ: type[T], val: Any) -> T:
    """An inline version of isinstance."""
    assert isinstance(val, typ)
    return val

This works fine where typ is a type, but doesn’t work for unions. Is there a way to make this work for unions, or do I have to transform this into a loop?

Daverball · March 18, 2025, 8:04am

You would need PEP 747. A custom cast function is one of the very first motivating examples for that PEP.

But given that isinstance is only going to work for a small subset of types, and that the support for UnionType and runtime_checkable Protocols is largely considered to be a mistake in hindsight (due to their poor performance and, in the case of runtime-checkable protocols, unreliable runtime semantics), I’m not fully convinced that a naïve safe_cast function like this is actually particularly desirable.

That being said, I get that the int | float case in particular is going to be annoying to deal with, since within the type system int is considered a subtype of float, but that’s not true at runtime, so you will need to check both in order to preserve the same semantics. You could special-case float and complex in your safe_cast implementation, or write separate functions for casting to float/complex in order to deal with that.
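The "separate function for float" option could look something like the sketch below. The name as_float is hypothetical. Note that bool also passes the check, because bool is a subclass of int.

```python
def as_float(val: object) -> float:
    """Assert that val is a real number and return it typed as float."""
    # Type checkers accept an int where a float is expected, but
    # isinstance(1, float) is False, so both classes are checked here.
    assert isinstance(val, (int, float)), f"expected a number, got {type(val).__name__}"
    return val


total = sum(as_float(x) for x in [1, 2.5, 3])
```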

Edit: I got my wires crossed in the subtype relationship between complex/float/int

NeilGirdhar · March 18, 2025, 8:10am

safe_cast works wherever isinstance works, so it’s just the inline loop version of isinstance. It doesn’t need to work with every type. Tiny differences in performance are not relevant to me.

Daverball · March 18, 2025, 8:14am

Yes, but my point is that isinstance isn’t actually particularly safe when it comes to e.g. runtime-checkable protocols. The fact that isinstance can lie, as far as the type system is concerned, makes it unsuitable for verifying anything other than normal type objects and can lead to subtle bugs.

Performance is just another thing that makes it undesirable, especially if used in a loop, since you’re compounding the effect.
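To illustrate the point about runtime-checkable protocols, here is a small sketch with hypothetical class names. The runtime check only looks for an attribute with the right name, not for a compatible signature.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasTotal(Protocol):
    def total(self) -> float: ...


class Wrong:
    # Same method name, but an incompatible signature and return type.
    def total(self, a: int, b: int) -> str:
        return "not a float"


# isinstance only checks that a member named `total` exists, so this is
# True even though a type checker would not treat Wrong as a HasTotal.
matches = isinstance(Wrong(), HasTotal)
print(matches)
```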

NeilGirdhar · March 18, 2025, 8:19am

That’s not a problem with isinstance—that’s a problem with protocols. If you want safety, you can use an actual ABC.

Anyway, isinstance is the standard way of gating types in Python, so I don’t understand your point here.

Again, this isn’t relevant to me. If it was, I wouldn’t be using Python for this.

mikeshardmind · March 18, 2025, 12:56pm

This whole idea is unsafe, and an ABC isn’t any safer. ABCs are somewhat less safe than protocols because of additional runtime behavior and the registry capabilities, and they do not enforce a type any more than protocols do.
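The registry behavior mentioned here can be shown with a short sketch (class names are hypothetical). A class registered as a virtual subclass passes isinstance without implementing anything.

```python
from abc import ABC, abstractmethod


class Summable(ABC):
    @abstractmethod
    def total(self) -> float: ...


class Unrelated:
    pass


# Registering makes Unrelated a "virtual subclass"; the abstract method
# is not enforced for registered classes.
Summable.register(Unrelated)

passes_check = isinstance(Unrelated(), Summable)
has_method = hasattr(Unrelated(), "total")
```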

There’s no point in your safe cast function here. It isn’t particularly safer, it’s actually causing your static analysis to be erased for an assert at runtime which can be disabled. I’d focus on fixing the typing of self.tasks here instead such that this isn’t needed.

JamesParrott · March 18, 2025, 1:52pm

I can get it to work, and typecheck, for a specific safe_cast for int | float (using get_args to turn the union into a tuple of types for isinstance).
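One way the non-generic variant might be written is sketched below. This is a guess at the approach described, not the poster's actual code, and the names Number and safe_cast_number are hypothetical.

```python
from typing import Any, Union, get_args

Number = Union[int, float]


def safe_cast_number(val: Any) -> Number:
    """Inline isinstance check specialised to int | float."""
    # get_args(Number) evaluates to (int, float), which isinstance accepts.
    assert isinstance(val, get_args(Number))
    return val


total = sum(safe_cast_number(x) for x in [1, 2.0])
```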

Is it a hard requirement that safe_cast is generic over type vars that can be unions? That needs a bit more work, and may not be possible at all.

Personally I would refactor the validation of .total into __init__(self) too. Why should selfs exist with invalid (sub) totals?

NeilGirdhar · March 18, 2025, 6:46pm

I don’t know what you mean. There’s nothing “unsafe” about an inline isinstance, which is what’s proposed.

I mentioned ABCs only in response to the suggestion that using isinstance with runtime-checkable protocols is “unsafe” since they can produce false positives. Using isinstance with ABCs does not produce false positives. Therefore, we don’t need to consider isinstance problematic to use. In any case, I don’t agree with the viewpoint that “isinstance isn’t particularly safe”.

The purpose is to recast a for loop into a comprehension. Maybe you were confused by the name?

I simply want to write:

        total = sum(safe_cast(int | float, task.total) for task in self.tasks)

instead of

        total = 0.0
        for task in self.tasks:
            assert task.total is not None
            total += task.total

I guess this works for my case:

def non_none_guard[T](x: T | None, /) -> T:
    assert x is not None
    return x

total = sum(non_none_guard(task.total) for task in self.tasks)

It’s just like assert isinstance.

The type cannot be “fixed”. It includes None (and it’s not in my code anyway), so it needs to be gated.

If I didn’t need it to be generic over unions, I would just use the code above.

They’re not invalid. They may be None for some tasks as per the library I’m using. I don’t have control over the library, and I think the library design is fine.

oscarbenjamin · March 18, 2025, 7:06pm

If the purpose is just to get a runtime error when there is a None then do you need to do anything at all?

>>> sum([1, 2, None])
...
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

NeilGirdhar · March 18, 2025, 7:07pm

The purpose is to:

- get a runtime error and maintain static type checking, just like it would be if I had used isinstance, while
- having the convenience of using a comprehension rather than a loop.

In retrospect, I should have written these two bullet points at the top of my post. Sorry for the confusion.

oscarbenjamin · March 18, 2025, 7:32pm

Having a function that blows up on None or other types does not really do anything to maintain static type checking. Type checkers don’t keep track of exceptions, but that doesn’t mean exceptions don’t matter. For example, a type checker is happy with this function even though any code that uses it is going to fail at runtime:

def f(x: int) -> str:
    assert False

The fact that you have satisfied the type checker with assert x is not None does not mean that the type checker is doing anything useful.

I don’t see how any of this is better than type: ignore:

total: float = sum(task.total for task in self.tasks) # type: ignore

That seems most convenient and offers equivalent safety in terms of type checking. I don’t think the safe_cast or non_none_guard functions improve anything.

oscarbenjamin · March 18, 2025, 7:37pm

This can do what you wanted:

from typing import Any

def safe_cast[T1, T2](typ1: type[T1], typ2: type[T2], val: Any) -> T1 | T2:
    """An inline version of isinstance."""
    assert isinstance(val, (typ1, typ2))
    return val

e = 1 + safe_cast(int, float, None)

Both mypy and pyright are happy with it. Naturally it doesn’t work at actual runtime though.

NeilGirdhar · March 18, 2025, 7:39pm

I prefer avoiding type: ignore:

First, it’s a catch-all (even if you narrow it)—what if task.total is renamed? Or what if self.tasks is no longer iterable, but a callable that returns an iterable? We’d like to keep type checking on the whole line.
Second, readers will always wonder why it’s there since it shows no intent. We should not just be saying what we’re ignoring, but why.

Also, I find the error produced by isinstance(x, None) to be more readable than a type error on the sum. Checking the invariant may not be a big deal with sum, but it can be a big deal when you’re passing values into an interface that doesn’t check types at runtime, and may fail in some deep call stack.

In short, the benefit is writing clear, intentional code: You are declaring that nothing in the comprehension should be None. At runtime, you’re checking the invariant first, and then doing some operation that relies on the invariant.

JamesParrott · March 18, 2025, 8:30pm

Sure. But is an Assertion error the best way of handling tasks that might have good reason to have task.total = None?

> The purpose is to:
>
> get a runtime error and maintain static type checking, just like it would be if I had used isinstance, while
>
> having the convenience of using a comprehension rather than a loop.

Sorry to repeat myself, but those are both achievable. It’s making the type guard function generic and accepting unions that’s the issue.

I think the concept of numerical addition of scalars is already generic enough to justify writing a dedicated non-generic function to test for ‘numbers’ (just to make sure strings and lists etc. that also support + don’t creep in), instead of starting a battle with the type system or re-engineering Python’s type machinery.

NeilGirdhar · March 18, 2025, 8:37pm

Yes it is a good way of handling this since there’s no good reason for it to be None.

I’m not proposing “re-engineering” anything. Adding an inline isinstance is not re-engineering. It’s just a normal extension.

I never proposed “making the type guard function generic”. I’m not even sure what you mean by that since my idea has nothing to do with type guards. (Type guards are Boolean-valued functions that evaluate type membership for complex types at runtime, and whose output can be used by static type checkers in conditionals.)

I’m not sure how you want to achieve those things. I made a proposal that does that. I also think your specialization to numbers is too niche.

mikeshardmind · March 18, 2025, 9:23pm

Then you shouldn’t be using an assert. Asserts are for things that are invariantly true, where it is a programming error if they ever become false. This also won’t error how you expect if anyone runs the code with -O.
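If the check has to survive python -O, which strips assert statements, an explicit raise is one alternative. This is only a sketch; the name require_not_none and the choice of TypeError are assumptions.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def require_not_none(x: Optional[T]) -> T:
    """Return x unchanged, raising if it is None (not removed by -O)."""
    if x is None:
        raise TypeError("expected a value, got None")
    return x


total = sum(require_not_none(v) for v in [1.0, 2.5])
```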

NeilGirdhar · March 18, 2025, 9:25pm

It is something that’s “invariably true”. What I mean is that they are annotated in the library as None | float, but at this point in the code, None should be impossible. Hence the assertion.

mikeshardmind · March 18, 2025, 9:27pm

Then just do

sum(total for task in self.tasks if (total := task.total) is not None)

you don’t need an inline assert to remove the None possibility.

NeilGirdhar · March 18, 2025, 9:29pm

I want to assert that totals are non-none—not ignore them if for some unexpected reason they are none.
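The difference between the two approaches can be seen in a small sketch. Task is a hypothetical stand-in for the library's class: filtering silently drops the None entry, while the guard stops at it.

```python
from dataclasses import dataclass
from typing import Optional, TypeVar

T = TypeVar("T")


@dataclass
class Task:  # hypothetical stand-in for the library's task type
    total: Optional[float]


def non_none_guard(x: Optional[T]) -> T:
    assert x is not None
    return x


tasks = [Task(1.0), Task(None), Task(2.0)]

# Filtering: the None entry is skipped and the sum still succeeds.
filtered = sum(t for task in tasks if (t := task.total) is not None)

# Guarding: the None entry trips the assertion instead of being ignored.
try:
    sum(non_none_guard(task.total) for task in tasks)
except AssertionError:
    guard_failed = True
else:
    guard_failed = False
```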

ntessore · March 18, 2025, 10:38pm

I found it interesting that mypy understands this correctly:

from typing import Any, NoReturn, overload

@overload
def none_guard(x: None, /) -> NoReturn: ...
@overload
def none_guard[T](x: T, /) -> T: ...

def none_guard(x: Any, /) -> Any:
    assert x is not None
    return x

def f(x: int | float | None) -> None:
    reveal_type(none_guard(x))  # int | float

But pyright reveals the type to be int | float | None.