I’m learning fastapi and pydantic these days at work, and I’m interested in solving this design problem I have:
I have a general class MyEntry whose aim is to hold some values (in particular two important ones: a and b). For instance:
from pydantic import BaseModel

class MyEntry(BaseModel):
    # naive attempt to define type hints and some sort of validation
    # I wish to be more granular with the hints and to provide custom validation based on the type
    a: str | int | bool | list | None = None
    b: str | int | bool | list | None = None

    def is_filled(self):
        return ((self.a is not None) and (self.a != "")) or (
            (self.b is not None) and (self.b != "")
        )
Now I wish to define a custom entity/model for some business objects, for instance:
class MyPerson(BaseModel):
    name: MyEntry = MyEntry()
    age: MyEntry = MyEntry()
The problem is that I wish to have more specialized type hints and validation logic: for instance, name should be of type str and should be checked to be less than 10 chars, and age should be of type int and should be checked to be positive.
The class variables a and b both hold the same “business values”, but a will be written by a user, while b will be parsed by an LLM.
I wish to design and structure this properly from the start in order to be flexible later. For instance, I might enforce the validation on the creation of such entities only on the attribute a at first and on b later, or vice versa. How can I accomplish that with nice, pythonic code?
One idea of mine was to solve the problem like this, but I don’t know if I should define so many classes and stuff or I can be smarter with some pythonic concepts.
class Name(BaseModel):
    value: str | None = None

class NameEntry(MyEntry):
    a: Name = Name()
    b: Name = Name()

    def is_filled(self):
        # custom validation logic
        # still need to learn pydantic methods and concepts (:
        pass
I don’t fully understand the problem you’re trying to solve here so I will try to help with what I can.
If you want a and b to be class variables you should wrap their type in typing.ClassVar to tell pydantic that these are class variables:
from typing import ClassVar
from pydantic import BaseModel

class MyEntry(BaseModel):
    a: ClassVar[str | int | bool | list | None] = None
    b: ClassVar[str | int | bool | list | None] = None
Regarding custom validators I think you can write your own, like in this page, but I haven’t really tried that myself so I can’t say if it works well or not. @samuelcolvin is fairly active on this forum so maybe he has something to add?
Jacob - I don’t think that’s a good idea in this case. Those variables are intended to function as instance variables, not class variables. I don’t know if it would break things to add ‘ClassVar’, but (in the best case) it can only add unnecessary complications to the code. Pydantic classes (BaseModel) already have code in place to deal correctly with the type hints; they work similarly to dataclasses.
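To make the ClassVar point concrete, here is a minimal sketch (the class name WithClassVar is made up for illustration). It assumes Pydantic v2, where an attribute annotated with ClassVar is left out of the model's fields, so it is neither validated nor settable per instance:

```python
from typing import ClassVar, Optional

from pydantic import BaseModel


class WithClassVar(BaseModel):
    # Hypothetical model, only meant to contrast the two kinds of attribute.
    a: ClassVar[Optional[int]] = None  # class attribute, not a pydantic field
    b: Optional[int] = None            # regular (instance) field


# Only `b` is registered as a field.
print(list(WithClassVar.model_fields))  # ['b']

# `a=1` is treated as an unknown extra and ignored (the default behaviour),
# so `entry.a` still resolves to the class-level default.
entry = WithClassVar(a=1, b=2)
print(entry.a, entry.b)  # None 2
```

So with ClassVar the per-entry values of a would silently be dropped, which is not what the original MyEntry is meant to do.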
Personally I would be strongly in favor of your last idea. I think that is a pythonic way of tackling this problem.
The problem is mainly a design problem. As I understand it, you mainly want to balance the (sometimes) conflicting values of simplicity (and clarity) versus flexibility (keeping options open for change later).
Other things being equal, I think it’s best to let simplicity win, since you never know if you will really need those extra options (or what exactly they may be). Simplicity also directly impacts code maintainability – which is a terribly important aspect often forgotten by people.
In this case, I think, you also want to offload as much work as possible to pydantic. That means that you (should) want to minimize any custom validation code, and that if you do need it, it may be best to compartmentalize it as much as possible (in separate helper classes derived from BaseModel, exactly as you did at the end). Even a relatively simple function like “is_filled” in your initial MyEntry code seems too complex to me, too “messy” (it also doesn’t deal with all the possible types for ‘a’ and ‘b’; moreover, it defeats the purpose of letting pydantic do the base validations). So, as I see it, those extra helper classes (which you considered as an alternative) may add some extra bulk and some extra layers to the code, but ultimately they will make the code simpler (and, just because of that, also easier to maintain and change later).
```python
from typing_extensions import Annotated
from typing import Optional
from pydantic import BaseModel, StringConstraints, ValidationError, Field

class ShortStr(BaseModel):
    bar: Annotated[Optional[str], StringConstraints(max_length=10)] = None

class PositiveInt(BaseModel):
    bar: Annotated[Optional[int], Field(strict=True, gt=0)] = None

if __name__ == '__main__':
    foo_0 = ShortStr()
    foo_1 = ShortStr(bar='aaa')
    try:
        foo_2 = ShortStr(bar='aaaabbbbccc')  # 11 chars: too long
    except ValidationError as exc:
        print(exc)

    foo_3 = PositiveInt()
    foo_4 = PositiveInt(bar=1)
    try:
        foo_5 = PositiveInt(bar=0)  # not strictly positive
    except ValidationError as exc:
        print(exc)
```
I didn’t understand what you are trying to do with a and b, but guessing from your is_filled, maybe what you want is to impose some conditions that relate both of these fields. This can be done with model_validator, for example:
from typing import Optional
from pydantic import BaseModel, ValidationError, model_validator

class AtLeastOneSet(BaseModel):
    a: Optional[int] = None
    b: Optional[str] = None

    # An 'after' validator runs on the already-built instance,
    # so it is written as an instance method that returns self.
    @model_validator(mode='after')
    def _a_or_b_set(self):
        if self.a is None and self.b in (None, ''):
            raise ValueError('At least one of `a` or `b` must be set.')
        return self

if __name__ == '__main__':
    ab_1 = AtLeastOneSet(a=111, b='222')
    ab_2 = AtLeastOneSet(a=111)
    ab_3 = AtLeastOneSet(b='222')
    try:
        ab_4 = AtLeastOneSet(b='')  # empty string counts as not set
    except ValidationError as exc:
        print(exc)
    try:
        ab_5 = AtLeastOneSet()  # neither field set
    except ValidationError as exc:
        print(exc)
This is exactly what I had in mind, except I was too lazy to look up the technical details (and it has been too long for me to remember them). I think this is the way to base a custom validation model on pydantic and offload the grunt work to pydantic.
What is left is how I should design the base class MyEntry. To give more context, this class aims to hold two instance variables: a and b. Depending on the context, the values of a and b may be ShortStr or PositiveInt or some other custom field class with custom validation logic and a custom is_filled method.
I now wish to define a “base” class that abstracts these cases in order to define an “is_filled” method and other shared “protocols”. Moreover, I wish to be able to enable or disable the validation logic on an instance of this class. The idea is that today I wish to enforce the validation on a but not on b, or vice versa: for example, if a holds a human-edited value I wish to enforce the logic, whereas if I know the value is parsed from some LLM I wish not to do it, for now.
For instance, should I design like this:
from typing_extensions import Annotated
from typing import Optional
from pydantic import BaseModel, StringConstraints, ValidationError, Field

class ShortStr(BaseModel):
    value: Annotated[Optional[str], StringConstraints(max_length=10)] = None

    def is_filled(self):
        return self.value is not None and self.value != ""

class PositiveInt(BaseModel):
    value: Annotated[Optional[int], Field(strict=True, gt=0)] = None

    def is_filled(self):
        return self.value is not None

class MyEntry(BaseModel):
    # is this needed if I want to overwrite the "types" in the below classes?
    a: ShortStr | PositiveInt | bool | list | None = None
    b: ShortStr | PositiveInt | bool | list | None = None

    def is_filled(self):
        return self.a.is_filled() or self.b.is_filled()

# does something like this exist?
# @pydantic.enable_validation("a")  # hypothetical decorator
class ShortStrEntry(MyEntry):
    # a needs validation, b does not
    a: ShortStr | None = None
    b: ShortStr | None = None

    # can I remove this since the logic is taken from the parent class?
    def is_filled(self):
        return self.a.is_filled() or self.b.is_filled()

# does something like this exist?
# @pydantic.enable_validation("b")  # hypothetical decorator
class PositiveIntEntry(MyEntry):
    # b needs validation, a does not
    a: PositiveInt | None = None
    b: PositiveInt | None = None

    # can I remove this since the logic is taken from the parent class?
    def is_filled(self):
        return self.a.is_filled() or self.b.is_filled()