Is there a nice way to dynamically populate a TypedDict instance?

(I wasn’t sure if this should go here or in the general Help category, so I don’t mind if the category mods think it should be moved over)

I’m switching some JSON-related code over to running mypy in strict mode and ran into trouble with the following snippet (_hash_file is just a convenience wrapper around hashlib.file_digest):

class ArchiveHashes(TypedDict):
    sha256: str
    # Only SHA256 hashes for now.
    # Mark old hash fields with `typing.NotRequired`
    # to migrate to a different hashing function in the future.

def _hash_archive(archive_path: Path) -> ArchiveHashes:
    hashes: dict[str, str] = {}
    for algorithm in ArchiveHashes.__required_keys__:
        hashes[algorithm] = _hash_file(archive_path, algorithm)
    # The required keys have been set, but mypy doesn't know that, and
    # there's no `assert` that can be used to remedy the lack...
    return hashes  # type: ignore[return-value]

The comment in the code summarises my question: is there a way to avoid hitting mypy over the head with the # type: ignore directive and instead satisfy it that the required keys in ArchiveHashes have been set without having to hardcode them?

(the assert isinstance(hashes, ArchiveHashes) approach I’d otherwise use to work around type inference failures doesn’t apply for TypedDict)
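For anyone wondering why the isinstance route is closed off, here is a minimal sketch: TypedDict classes reject instance checks at runtime, so the assert would raise instead of narrowing (type checkers flag the call too). ArchiveHashes is copied from the snippet above; the sample value is a made-up placeholder.

```python
from typing import TypedDict


class ArchiveHashes(TypedDict):
    sha256: str


# Placeholder digest for illustration only, not a real hash
hashes: dict[str, str] = {"sha256": "0" * 64}

try:
    isinstance(hashes, ArchiveHashes)
except TypeError as exc:
    # TypedDict classes refuse isinstance()/issubclass() checks at runtime
    isinstance_error = str(exc)
else:
    isinstance_error = None

print(isinstance_error)
```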

You could write a TypeGuard function that verifies the key is present[1]:

def is_archive_hashes(x: dict[str, str]) -> TypeGuard[ArchiveHashes]:
    return 'sha256' in x

That said, mypy may, and probably should, complain about that type guard definition: dict is invariant, and a TypedDict without PEP 728 gives no guarantees about extra keys, so you can’t really safely cast from dict[str, str] to ArchiveHashes.


  1. You shouldn’t use TypeIs here, since this check would not be a safe TypeIs guard.


After hitting a similar problem with a json.load call elsewhere in the code, I realised that typing.cast covers exactly this situation (i.e. telling the typechecker to assume the value conforms to the type, even if that can’t be proved via static analysis).

    @staticmethod
    def _hash_archive(archive_path: Path) -> ArchiveHashes:
        hashes: dict[str, str] = {}
        for algorithm in ArchiveHashes.__required_keys__:
            hashes[algorithm] = _hash_file(archive_path, algorithm)
        # The required keys have been set, but mypy can't prove that,
        # so use an explicit cast to allow it to make that assumption
        return cast(ArchiveHashes, hashes)

It’s horrendously unsafe, but still nicer than the # type: ignore directive.
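To make the "unsafe" part concrete, a small sketch showing that cast does no checking whatsoever at runtime: it hands back its argument unchanged, even when the required key is missing. The empty dict is a deliberately wrong input chosen for illustration.

```python
from typing import TypedDict, cast


class ArchiveHashes(TypedDict):
    sha256: str


incomplete: dict[str, str] = {}  # deliberately missing 'sha256'
result = cast(ArchiveHashes, incomplete)

# cast() is a runtime no-op: same object back, no validation performed
same_object = result is incomplete
has_required_key = "sha256" in result

print(same_object, has_required_key)
```

So the cast only moves the burden of proof from the type checker onto the surrounding code, which is fine here because the loop over `__required_keys__` is what actually guarantees the keys are set.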
