In the JSON-LD spec, a field can basically hold either a single value or a set of values: T | set[T], where T can be str, HttpUrl and many other types. When I’m parsing such JSON and want to add a new value to a field, I first need to know whether the field is a set or something else, like a string. When it’s a string, I turn the field into a set before adding the new value. When it’s already a set, I just add the new value. When the new value is itself a set, I merge it into the existing field. So in every case I have to check the field type before doing anything.
I decided to keep it simple and model this object with a TypedDict. Here is an example, the Person type:
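The merge rule described above can be sketched on a plain dict, leaving type annotations aside for a moment. The name add_value is hypothetical and only there for illustration; the sketch also assumes every value is hashable.

```python
from typing import Any


def add_value(obj: dict[str, Any], field: str, value: Any) -> None:
    """Add value to obj[field], promoting a scalar to a set when needed."""
    # Normalise the incoming value to a set so both cases merge the same way.
    new_values = set(value) if isinstance(value, set) else {value}
    if field not in obj:
        # First value for this field: store it as given (copy sets to avoid aliasing).
        obj[field] = set(value) if isinstance(value, set) else value
        return
    current = obj[field]
    if isinstance(current, set):
        current |= new_values
    else:
        # Existing scalar: promote it to a set, then merge.
        obj[field] = {current} | new_values


person: dict[str, Any] = {"email": "a@example.org"}
add_value(person, "email", "b@example.org")    # scalar + scalar
add_value(person, "email", {"c@example.org"})  # set + set
```

This version always promotes to a set on the second value, so it does not protect fields such as name that must stay a single string; that is exactly where the annotations come in.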
from typing import TypedDict
from pydantic import EmailStr, HttpUrl, constr

# RE_LANGUAGE and RE_COUNTRY are regex patterns defined elsewhere.
class Person(TypedDict, total=False):
    name: str
    email: EmailStr | set[EmailStr]
    homeLocation: str | set[str]
    alternateName: str | set[str]
    description: str | set[str]
    familyName: str
    givenName: str
    identifier: str | set[str]
    image: HttpUrl | set[HttpUrl]
    jobTitle: str | set[str]
    knowsLanguage: (
        constr(pattern=RE_LANGUAGE) |
        set[constr(pattern=RE_LANGUAGE)]
    )
    nationality: (
        constr(pattern=RE_COUNTRY) |
        set[constr(pattern=RE_COUNTRY)]
    )
    OptOut: bool
    sameAs: HttpUrl | set[HttpUrl]
    url: HttpUrl
    workLocation: str | set[str]
    worksFor: str | set[str]
First issue: it’s impossible to add methods to a TypedDict (I could inherit from dict and do so, so I don’t understand why not). So I wrote a plain function instead, and I already feel that I’ve done something dirty:
def person_set_field(person: Person, field: str, value: str | set) -> Person:
    """Set a field, converting it to a set when the annotation allows one.

    WARNING: only works with set/str

    Args:
        person (Person): person's dict
        field (str): field name to set
        value (str | set): value

    Returns:
        Person: person's dict
    """
    # quirky hack to check if one of the annotations could be a set of something
    is_dest_set = RE_SET.match(str(Person.__annotations__[field]))
    if is_dest_set:
        if field not in person:
            person[field] = set()
        elif type(person[field]) is not set:
            # promote the existing single value to a one-element set
            person[field] = {person[field]}
        if type(value) is set:
            person[field] |= value
        else:
            person[field] |= {value}
    else:
        # the field can never be a set: plain overwrite
        person[field] = value
    return person
Here comes the most surprising part: I find it horribly difficult to check whether a field annotation such as set, set | str, set[str] | str or str | set[HttpUrl] can be a set. So I came up with this regular expression:
RE_SET = re.compile(r"(\s|^)set\W")
Here I feel that I’ve done something very unpythonic, shame on me! So here is my question: what would be the pythonic way to do what I intend to do?