Annotating a variable without specifying its type

nonagon · March 13, 2024, 4:49pm

I’ve been acquainting myself with Python’s type annotations (after way too long being stuck with much older versions of Python), and they have inspired a lot of exciting applications in my code (in my case the codebase is the backend for a medium-sized commercial web application).

Here’s one example that came up recently. I have a simple enumeration class inspired by StrEnum but with different behavior. It’s implemented with a metaclass just like StrEnum. In my case I need a way to specify one of the enumerated params as the default. Type annotations have let me do it this way:

class SearchType(EnumParam):
    NAME = 'Name'
    USER_ID: Annotated[str, Default] = 'User ID'
    BIRTHDAY = 'Birthday'

This works really well, and the exciting thing for me is that type annotations let you say something about a variable without repeating its name. It lets you say that thing where the variable is defined, which is the most natural place. The alternative approaches lack the simplicity of the above (e.g. class SearchType(EnumParam, default='USER_ID') or USER_ID = default('User ID'), etc).
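For readers wondering how a metaclass can pick this up at runtime, here is a minimal sketch. It is not the actual EnumParam implementation (the post does not show it); EnumParamMeta, the Default marker class, and the default_name attribute are assumptions made for illustration.

```python
from typing import Annotated, get_origin, get_type_hints


class Default:
    """Hypothetical marker; its presence in Annotated metadata flags the default."""


class EnumParamMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.default_name = None
        # include_extras=True keeps the Annotated wrapper and its metadata.
        for attr, hint in get_type_hints(cls, include_extras=True).items():
            if get_origin(hint) is Annotated and Default in hint.__metadata__:
                cls.default_name = attr


class EnumParam(metaclass=EnumParamMeta):
    pass


class SearchType(EnumParam):
    NAME = 'Name'
    USER_ID: Annotated[str, Default] = 'User ID'
    BIRTHDAY = 'Birthday'


print(SearchType.default_name)                       # USER_ID
print(getattr(SearchType, SearchType.default_name))  # User ID
```

Only the annotated member carries the marker, so the default is declared exactly once, on the line where the member is defined.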

So far so good, but what I’d really like is something like this:

class SearchType(EnumParam):
    NAME = 'Name'
    USER_ID: Default = 'User ID'
    BIRTHDAY = 'Birthday'

The problem is that type annotations currently require you to specify a type when adding annotations. I don’t want to specify types in this case - that’s a much longer discussion but I’ve run into trouble trying to strictly type throughout my code base (I come from a C++ background so the idea had very strong appeal to me! I just ran into some practical challenges with it). More to the point - it’s always been crystal clear that type annotations will never be forced on a Python developer nor will they be enforced at runtime.

So my idea is simply a way to annotate variables without being forced to specify a type. This could be a single-argument form of Annotated that just has an annotation, or perhaps a new member of the typing module with a different name. I haven’t contributed to Python’s implementation, but I have to imagine that this is a pretty trivial addition, and the only requirement outside Python itself would be that static type checkers know to ignore these annotations entirely. Just like with Annotated, any tooling which doesn’t recognize the form of an annotation would ignore it.

I understand that there hasn’t been much enthusiasm for non-type applications of Python’s type annotations. I looked into FastAPI’s use of Annotated to specify things like ranges for variables. In these cases Annotated is saying something beyond an object’s type - something that further specifies its behavior or allowed values. I see this as an elegant application of Annotated but my question is why Python forces the developer to specify a type in order to be allowed to state these other things about the variable?

Type hints have managed to sit elegantly on top of Python without affecting those who opt out of using static type analysis. I think it would be consistent with that philosophy to allow the annotation of variables as described above without being forced to also specify a type (and thus change the behavior of linting and auto-complete in the IDE for example). This is all toward making the definition of variables more expressive which helps make code both clear and concise.

storchaka · March 13, 2024, 4:58pm

You can specify this in other ways:

    USER_ID = Default('User ID')

or

    USER_ID = 'User ID', Default

or

    __default__ = USER_ID = 'User ID'

or

class SearchType(EnumParam, default='USER_ID'):

or even simply

SearchType.set_default(SearchType.USER_ID)

Using a type annotation is not necessarily the best (or even the correct) way.

nonagon · March 13, 2024, 5:18pm

Yes, agreed that there are lots of ways to do this. Most of the examples you gave (and most of my ideas) require that the identifier name be typed again. I view this as a pretty big drawback - it’s not concise, and it moves the expression of default away from where the variable is defined.

The other approaches include wrapping the value in Default() and combining it as a tuple. These appear coupled with the variable’s value. To me these are much less clear - the idea of “default” is something that has to do with the variable identity itself, not its value.

I think of “default” as metadata attached to the variable, and the natural way to attach metadata to a variable at definition time is with annotations. I realize this is somewhat subjective and it’s not exactly what annotations were designed for, but to me it’s conceptually related. I’m curious if anyone agrees though!

pf_moore · March 13, 2024, 5:48pm

If you want to use Annotated, but don’t want to specify types, the type Any is probably what you want to use. I’m not personally sufficiently enthusiastic about Annotated to be interested in supporting it outside of typing contexts.

nonagon · March 13, 2024, 5:56pm

I was using Any at first, but it ruins autocomplete in my IDE. There’s a static type analyzer running to give really amazing autocomplete and Any overrides its type inference.

So I looked pretty hard for a transparent type that wouldn’t affect type checkers, but it seems that there aren’t any!

MegaIng · March 13, 2024, 6:57pm

See also this proposal of mine: An alternative to Annotated - #14 by Jacob-Stevens-Haas

blhsing · March 13, 2024, 10:27pm

I like the idea of making Annotated accept 1 parameter just so that it looks cleaner, without the noise of an Any:

NAME: Annotated[DEFAULT]

But at least for your specific use case, you can also use a metaclass with a custom namespace to realize an even simpler usage:

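class EnumParamName:
    # Placeholder for a name that is not yet defined in the class body.
    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name

    def __setitem__(self, key, value):
        # Handles `USER_ID [DEFAULT] = 'User ID'`: record the tag, then bind the value.
        self.namespace.setdefault('_annotations', {})[key.name] = self.name
        self.namespace[self.name] = value

class EnumParamNamespace(dict):
    def __getitem__(self, name):
        if name in self:
            return super().__getitem__(name)
        if name.startswith('__'):
            # Let dunder lookups such as __name__ fall through to globals.
            raise KeyError(name)
        return EnumParamName(self, name)

class EnumParamMeta(type):
    @classmethod
    def __prepare__(cls, name, bases):
        # The class body executes with this custom namespace.
        return EnumParamNamespace()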

so that:

class SearchType(metaclass=EnumParamMeta):
    NAME = 'Name'
    USER_ID [DEFAULT] = 'User ID'

print(SearchType.NAME) # outputs Name
print(SearchType.USER_ID) # outputs User ID
print(SearchType._annotations['DEFAULT']) # outputs USER_ID

Demo: s6cDzv - Online Python3 Interpreter & Debugging Tool - Ideone.com

The IDE will likely complain about USER_ID and DEFAULT being undefined though.

a-reich · March 13, 2024, 10:47pm

You say that you don’t want to use annotations for types, but OTOH you also say that you don’t like the Any solution because you use a static analyzer that gives you information based on… the types it infers from annotations. If the annotations don’t have the right type for your attribute, your tools don’t work the way you want.
I’m not sure how to reconcile these?

MegaIng · March 13, 2024, 11:09pm

IDEs use annotations if present, otherwise infer the types. This means annotations that aren’t type hints hurt those tools and they would work better if they weren’t there at all. This prevents this really convenient syntax from being used for other stuff.
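A small sketch of the difference being described. The Default marker is hypothetical, and the comments describe what a checker such as pyright typically reports; other tools may differ in the details.

```python
from typing import Annotated, Any


class Default:
    """Hypothetical marker class, used only as Annotated metadata."""


class SearchType:
    # No annotation: a checker infers str from the assigned value,
    # so str methods show up in autocomplete.
    NAME = 'Name'
    # Explicit annotation: the declared type Any replaces the inferred str,
    # so the tool no longer knows which attributes are available.
    USER_ID: Annotated[Any, Default] = 'User ID'


print(SearchType.NAME.upper())     # NAME
print(SearchType.USER_ID.upper())  # USER ID
```

Both lines behave identically at runtime; the difference is only in what static tooling knows about USER_ID.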

blhsing · March 14, 2024, 1:37am

You can also simply use a type alias for the purpose (validated with pyright and mypy):

from typing import Annotated, Any

type Default = Annotated[Any, 'DEFAULT']

class SearchType:
    NAME = 'Name'
    USER_ID: Default = 'User ID'

To get the default SearchType:

print(next(name for name, annotation in SearchType.__annotations__.items()
    if annotation == Default)) # outputs USER_ID

Now you just need to switch to an autocomplete tool that isn’t thrown off by Any.

MegaIng · March 14, 2024, 2:26am

That would reduce the quality for third party libraries since it essentially means choosing an autocomplete tool that ignores type hinting.

What exactly is the problem with acknowledging that this is a drawback of the current annotation ecosystem, where type hints are so much the core that using annotations for something else is basically impossible?

You don’t have to agree that this is a drawback that needs to be fixed, but don’t provide incomplete/wrong workarounds.

blhsing · March 14, 2024, 2:31am

Yes, I can fully get behind the idea of officially supporting the use of the annotation syntax without involving a type hint.

My workarounds are simply ideas that the OP may be able to adopt for the time being.

nonagon · March 14, 2024, 2:14pm

Regarding my use of annotations vs static analyzer usage - as Cornelius said, the analyzers do a good job of inference based on all sorts of things (type stubs, variable values, control flow, etc). The autocomplete has gotten worlds better now that Python supports type annotations and it’s amazing! I’m basically autocompleting every token that I’m not defining, which is critical to my productivity.

I do in fact use type annotations… but I only use them when I want to (i.e. when I think it’s necessary or worth it). Annotating a type as Any should tell any properly written static analyzer to change autocomplete, since I’m overriding any inference it’s made. So I don’t think I can find a better analyzer to fix this, I simply need a way to avoid being forced into making type annotations I don’t want to make.

nonagon · March 14, 2024, 3:03pm

The custom namespace is pretty cool, I hadn’t seen that before. I’m actually using a type alias now and pyright is happy with it. To avoid ruining autocomplete I need a separate type alias for each type of variable I’m decorating though, which is unfortunate.
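A rough sketch of what the per-type aliases might look like, using plain assignment aliases so it also runs before Python 3.12. DefaultMarker, DefaultStr, DefaultInt, PageSize and default_name are all hypothetical names, not taken from the actual codebase.

```python
from typing import Annotated, get_type_hints


class DefaultMarker:
    """Hypothetical sentinel carried in the Annotated metadata."""


# One alias per underlying type, so the checker still sees str or int.
DefaultStr = Annotated[str, DefaultMarker]
DefaultInt = Annotated[int, DefaultMarker]


class SearchType:
    NAME = 'Name'
    USER_ID: DefaultStr = 'User ID'


class PageSize:
    SMALL = 10
    MEDIUM: DefaultInt = 25


def default_name(cls):
    """Return the name of the member tagged with DefaultMarker, if any."""
    for name, hint in get_type_hints(cls, include_extras=True).items():
        if DefaultMarker in getattr(hint, '__metadata__', ()):
            return name
    return None


print(default_name(SearchType))  # USER_ID
print(default_name(PageSize))    # MEDIUM
```

The cost is visible here: every new value type needs its own alias, which is the duplication the post complains about.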

I have a few other applications for type annotations that are more involved / harder to explain where type aliases won’t work at all. Some kind of single-argument Annotated or a transparent (“static type checkers ignore please”) type would solve all of my use cases pretty elegantly though.

I’ve been looking around for other examples where developers had to work around the requirement that a type be defined in order to attach metadata to a variable. I’m new to them so forgive me if this is a flawed argument - but one example might be dataclasses from the standard library. They require a type annotation to indicate that a class var is a field. They don’t make much use of the types themselves:

With two exceptions described below, nothing in dataclass() examines the type specified in the variable annotation.

Those two exceptions were interesting! The first is that dataclass needs a way to let you annotate a class variable with a type but not automatically enroll it as a field. I’d say they got lucky that ClassVar existed for a different purpose (to prevent class vars from becoming instance vars) but dataclasses happen to use class vars too, so ClassVar was available as a sort of “transparent” type that a developer could always add, even though I think this is an abuse of ClassVar. In other words if I use ClassVar to tell dataclass not to pick a variable up as a field, a reader of my code who is familiar with ClassVar (as documented in the typing module) would be rather confused.

The second example is somewhat relevant here as well: dataclass needs a way to mark fields as init-only, so they provide InitVar which is a generic type that wraps the actual type of the class var. This isn’t ideal, in fact it ruins autocomplete as well in pyright (using the example from the dataclass docs, reveal_type(C.database) shows Any because InitVar confused the type checker). I think what the dataclass authors want is a way to add a boolean flag to the field type definition which will be ignored by static type analyzers but picked up by their code at runtime. This is exactly my use case too.
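The two exceptions can be seen in a short example. The Record class and its members are made up for illustration; only dataclass, fields, InitVar and ClassVar come from the standard library.

```python
from dataclasses import InitVar, dataclass, fields
from typing import ClassVar, Optional


@dataclass
class Record:
    name: str                             # regular field
    size: int = 0                         # regular field with a default
    instances: ClassVar[int] = 0          # skipped: ClassVar is not a field
    source: InitVar[Optional[str]] = None  # init-only: passed to __post_init__

    def __post_init__(self, source):
        if source is not None:
            self.name = f"{source}:{self.name}"


print([f.name for f in fields(Record)])   # ['name', 'size']
print(Record('a', source='db').name)      # db:a
```

In both cases the wrapper is consumed by dataclass at runtime as a flag, which is the role the post would like a type-free annotation to play.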

Clearly a lot of work went into making type annotations available at runtime. The Python devs had to tackle performance considerations and a way to support postponed evaluation, and more syntax (__annotations__ etc). I looked around for code that makes use of this runtime support, and it always seems somewhat distinct from strict formal type analysis. So I guess in summary I’m surprised that there has been so much pushback when I’m just trying to get value from this runtime support in a way that feels (at least to me) very Pythonic!

alicederyn · March 14, 2024, 4:02pm

I think that what is being given is advice rather than pushback. The issue is that addressing the root issue must be done via a PEP (Python Enhancement Proposal), and it would be on someone actually experiencing the pain to dedicate the time to that. Which you can totally do! I have made it through that process recently and I have lots of positive things to say about it. But you’re not going to just see an issue like this, point it out, and have someone else go “good point! let’s fix it immediately” and submit a PR. Especially here, where this would affect lots of tools, and where this was deliberately disallowed in the PEP that introduced Annotated.

alicederyn · March 14, 2024, 4:07pm

Incidentally, I’m curious why you want to use Annotated at all? As in, why not propose a mechanism whereby you can use a non-type annotation directly and have type checkers ignore it? Perhaps a base class that serves to tag them.

nonagon · March 14, 2024, 4:35pm

Cool yes I’d be more than happy to create a PEP. I figured that would be required at some point. I was just surprised that so many people (in other forums) considered my applications to be orthogonal to the intended use of type annotations. To me they’re all analogous to other use cases that are out there (e.g. the dataclass issues I described above), and there’s no satisfying way to circumvent the requirement that a formal type be specified to invoke the annotation syntax.

I’ve tried a long list of things to avoid Annotated. They all work to some extent, but for example a common base class isn’t great when many of my variables are simple integer or string literals. So if I’m used to writing timeout = 30 then it’s hard to be happy with timeout = SomeIntDerivative(30) in terms of simplicity or clarity.

I think conceptually the metadata belongs with the name at definition time and most of the workarounds involve doing something to tag the value.

lawsonjl · March 14, 2024, 5:17pm

I don’t think the suggestion of a single-arg Annotated[Value] is a good idea, because it looks too much like it means Annotated[Type] - a type annotated with 0 values. Probably nobody would intentionally write something like that, but it could conceivably come up in a synthesized type [1]. And you’d have to decide if type checkers should interpret that as an explicit Any, or if they’re allowed to use type inference there.

The “transparent type” (typing.Infer, maybe?) is a much better idea, because it composes with what’s already available instead of adding a special case to an already fairly-special form. I think it should mean “whatever type the type checker would have inferred, had there not been an annotation here”.
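To make the composition point concrete, here is a sketch with a stand-in Infer class. Nothing called Infer exists in the typing module today; the class below and the Default marker are purely hypothetical, and current type checkers would treat Infer as an ordinary class and report that a str is not assignable to it.

```python
from typing import Annotated, get_args, get_type_hints


class Infer:
    """Hypothetical placeholder for a 'use the inferred type' form."""


class Default:
    """Hypothetical marker class."""


class SearchType:
    NAME = 'Name'
    # Composes with the existing two-argument Annotated form,
    # and any number of metadata items can follow the first argument.
    USER_ID: Annotated[Infer, Default, 'shown first'] = 'User ID'


hint = get_type_hints(SearchType, include_extras=True)['USER_ID']
first, *metadata = get_args(hint)
print(first is Infer)  # True
print(metadata)
```

Runtime consumers would read the metadata exactly as they do now; only type checkers would need to learn the new form.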

[1] Kind of like how Union[T] could appear from a construction like def one_of(*values: *Ts) -> Union[*Ts]: ...

alicederyn · March 14, 2024, 5:21pm

To be clear, I meant a PEP proposing a common base class for non-type-related annotations, so you can just write your original ideal code (USER_ID: Default = 'User ID').

Default here would inherit from this hypothetical base class.
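A sketch of how that could look at runtime. NonTypeAnnotation is a made-up name for the hypothetical base class, and tagged_names is an illustrative helper; today's type checkers would still read Default as a type and flag the str assignment.

```python
class NonTypeAnnotation:
    """Hypothetical base class that checkers would be told to ignore."""


class Default(NonTypeAnnotation):
    """Hypothetical marker for the default member."""


class SearchType:
    NAME = 'Name'
    USER_ID: Default = 'User ID'
    BIRTHDAY = 'Birthday'


def tagged_names(cls, marker):
    """Names in cls whose annotation is the given marker class or a subclass."""
    return [
        name
        for name, ann in cls.__annotations__.items()
        if isinstance(ann, type) and issubclass(ann, marker)
    ]


print(tagged_names(SearchType, Default))  # ['USER_ID']
```

The runtime side already works; the proposal would only change how static tools interpret annotations derived from the base class.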

My main concern with this proposal is that it only solves the single annotation case. What if I need two? I’m back to requiring Any again.

nonagon · March 14, 2024, 6:02pm

Hmm yes I was hesitant to propose anything specific since names are so hard to get right for this stuff. How about Annotation[...] (i.e. takes multiple args for multiple annotations, although single-arg ones could be union’d I suppose) which is simply the annotation part of Annotated without the type? That way it doesn’t invoke any types at all:

timeout: Annotation[range(0, 30)] = 20

and then I’d alias it in this particular case (type Default = Annotation[SomeSentinelClass]) but I think it would support all of the cases I’ve seen in my code and in public libraries.
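A rough runtime mock-up of the idea, to show how such annotations could be consumed. The Annotation class here is a hypothetical stand-in written for this sketch (it is not part of typing), and Settings and out_of_range are illustrative names.

```python
class Annotation:
    """Hypothetical: carries metadata only, with no type attached."""

    def __init__(self, *metadata):
        self.metadata = metadata

    def __class_getitem__(cls, item):
        # Annotation[a] and Annotation[a, b] both end up as a metadata tuple.
        items = item if isinstance(item, tuple) else (item,)
        return cls(*items)


class Settings:
    timeout: Annotation[range(0, 30)] = 20
    retries: Annotation[range(0, 5)] = 7


def out_of_range(cls):
    """Names whose class-level value falls outside an attached range."""
    bad = []
    for name, ann in cls.__annotations__.items():
        if not isinstance(ann, Annotation):
            continue
        value = getattr(cls, name)
        if any(isinstance(m, range) and value not in m for m in ann.metadata):
            bad.append(name)
    return bad


print(out_of_range(Settings))  # ['retries']
```

A type checker would need to recognize the real form and fall back to inference for the annotated name; without that, a mock-up like this is rejected statically even though it runs.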