Best practice for type checking and assert statement

Hi,

I would like to learn the ideas and concepts behind designing Python programs: how to define type hints, how to check the types of inputs, and how to check whether the inputs exist at all.

For instance, imagine I need to define a function that takes a list of strings as input.

def fun(input: list[str]):
    if input and isinstance(input, str):
        input = [input]
    assert isinstance(input, list), "input should be a list"

Now, is that example a best practice on how to define a function?

  1. Is the first check (wrapping a single str in a list) good code design? I imagine there is a concept of overloading a method in Python; should I solve the problem by overloading the function?

  2. Is an assert statement the best practice for enforcing type checks? Should I use it in production?

  3. In my example, if the input is a list of int, the checks pass. How can I also check that it is a list of str? Can the type hints be used to perform this kind of check automatically at runtime with some sort of decorator? Would I want to do this kind of check in Python and/or in production?

  4. How would I check whether the input exists? Should I check for None? Should I use None at all, or is it best practice to design the code with default values that avoid None, since None may “represent” different states of the program and so lose the information needed to discriminate between them? Should I prefer, in this example, the default []?

Any idea or concept is welcome; I am new to Python.

Hi @lmaurelli - I really appreciate your thoughtfulness. The questions are so good that I’m tempted to say: Nobody should help you, you will be able to figure this out and form your own opinions :rofl:

But - since you asked - I’d have the following PR comments:

  • Don’t use “input”. Never use the names of built-in functions as argument names. It is valid Python code (unfortunately!) but imo it’s bad style since you’re shadowing the built-in name inside the function. Linters will also flag it. For instance, pylint reports a redefined-builtin warning.
  • If the input arg can be either a list or a string, and you’re going to use type hints, then use the type hints to make that clear. For instance:
def fun(lis: list[str] | str):     # or Union[List[str], str] for Python versions before 3.10
    ...
  • Assertions are a debugging tool and imo not the best way to enforce compliance. For instance, a user could turn them off by running python -O your_script.py!
    If you really want to enforce compliance, you should check the type and raise a ValueError or a TypeError. A TypeError is probably most appropriate in this case, since the problem is the type of the argument rather than its value. The error message should also say that the input arg must be either a list or a string, if you want to be precise and more user-friendly.
  • Type hints are imo a kind of scam in Python since they are not enforced at runtime. They’re useful in large projects to enhance readability and to get better support from static checkers, but they will not prevent runtime mishaps. So, if you want runtime checks, you either have to make those checks explicitly in the code or use a special decorator. (The decorator could be made kind of invisible and could pick up the type hints by reading the function’s __annotations__.)
  • As to (3) - Having to do all those checks can become tedious (or lead to significant slow-downs). It again depends on how and where this code is going to be used, what kind of guarantees you want to make, and what other error handling you have. (There is also something to be said for “Just let it crash!” since it’s virtually impossible to catch every possible kind of error and handle them all in a sensible way.)
  • As to (4) - There is no general answer as to whether it’s better to use default values or not. That really depends on the kind of function and how it’s supposed to be used. Just never ever use mutable container objects like lists or dicts as defaults: the default object is created once, when the function is defined, and is then shared by every call, which can lead to very unexpected behavior. In this example, if you want a default, you could use something like this:
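def fun(lis: list[str] | str | None = None):
    if lis is None:
        lis = []          # a fresh list on every call
    elif isinstance(lis, str):
        lis = [lis]       # treat a single string as a one-item list
    if not isinstance(lis, list):
        raise TypeError(f"lis must be a list of str, a str, or None, not {type(lis).__name__}")
    if any(not isinstance(x, str) for x in lis):
        raise TypeError("every item in lis must be a str")
    # ... more code ...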
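
To make the point about python -O concrete, here is a minimal sketch. It does not start a second interpreter; it assumes that the built-in compile() with its optimize argument is a fair stand-in for what -O does to assert statements. The names SOURCE, load_check and check are made up for this illustration.

```python
# Hypothetical demo: the same source compiled with and without optimization.
SOURCE = '''
def check(value):
    assert isinstance(value, list), "value should be a list"
    return "accepted"
'''


def load_check(optimize):
    """Compile SOURCE at the given optimization level and return its check()."""
    namespace = {}
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["check"]


checked = load_check(optimize=0)    # like plain `python`: asserts are kept
unchecked = load_check(optimize=1)  # like `python -O`: asserts are stripped

try:
    checked(42)
except AssertionError as exc:
    print("optimize=0:", exc)       # prints "optimize=0: value should be a list"

print("optimize=1:", unchecked(42))  # prints "optimize=1: accepted"
```

With the assert stripped, the wrong type passes straight through, which is why an explicit check that raises TypeError is the safer choice for anything that has to hold in production.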

As to “best practice” in using default values or not (and if so, which ones) - it might help to look at how a well-designed, mature library like pandas does this. See, for instance:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.info.html

The only kind of general principles that I can think of here are:

  • never use mutable container objects (lists, dicts, sets) as default values;
  • if a container type needs to be supported, use None as the default and create the container inside the function;
    and perhaps also
  • if different types of an argument are supported (for instance bool and int and str), just use None as the default
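
The first two principles can be illustrated with a small sketch. Both function names are hypothetical and exist only for this example; the point is that a list used as a default is created once and then reused by every call that omits the argument.

```python
def append_shared(item, bucket=[]):
    """Pitfall: the default list is created once and shared by all calls."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Recommended pattern: None as the default, new list built per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling append_shared("a") and then append_shared("b") returns the same list object both times, so the second result is ["a", "b"]. The same two calls to append_fresh return ["a"] and then ["b"].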

Btw - there are libraries available that do support using type hints at runtime for type checking.
I’ve never used any of those, so I don’t know how mature/stable/useful they are or what kind of runtime overhead they come with. I also don’t know of any major packages that are using them.
