Add decorator for script entry point

Currently when we write a script we use some boilerplate:

if __name__ == '__main__':
    main()

While this boilerplate is small, it is easy to make a mistake in it. I propose that we add an entrypoint decorator that would be used like this:

@entrypoint
def main(): ...

The implementation of the decorator is itself rather simple:

def entrypoint(f: func):
    (f, f())[0] if __name__ == '__main__' else f

This allows a simple, easily teachable entry point for scripts that is less prone to typos (I will not enumerate the number of typos I had while typing the code samples for the boilerplate).

I know that this does not cover every use of the if __name__ == '__main__' block, but I think it would be a small, simple improvement that we could easily make.

You can use click for this kind of behavior.

import click

@click.command()
def main():
    ...

With the entry point defined in pyproject.toml, as it should be. No need for if __name__ == "__main__".

I don’t think there needs to be yet another way to do this, nor does this specific way need to exist in the stdlib.

Not every script has a pyproject.toml and it would make teaching simpler because it’s much easier to understand than the boilerplate. That is one of the reasons I am proposing this, to improve teachability.

I made a silly mistake earlier: click is superfluous to the requirements. This just requires an entry point.

I don’t think it does improve teachability, though. It adds another variation. The old way doesn’t go away; it exists all over the place, so people still need to learn it.

There’s already a simple way to make a script execute: don’t use the check at all and just run the code. if __name__... is a way to allow the script to be executed or be imported. But there’s a better way to do that now: build a package and define the entrypoint.

Given that understanding how to write a package is surely an upcoming step for anyone learning Python, I just don’t see the motivation.

Another use case is tests included in the module under that if __name__ block, which is less uncommon than might be thought. Especially when teaching.

It’s not MUCH less prone to typos. The teaching value of if __name__ == '__main__': is that a Python script simply runs, it doesn’t have declarations. Everything is executable code, including function definitions. Replacing that with a magical decorator makes it into something special instead of a natural consequence of the way that Python is built. That would not be an improvement.

Yes, I use if __name__ blocks for testing somewhat frequently. Usually with the plan to move the tests into pytest later.

But I almost never use it in the format if __name__=="__main__": my_func(). I like if __name__=="__main__" because it’s nice and explicit. You could in principle put it anywhere in the code.
If I want to use the block for testing, I generally need a few lines for set-up, and then I also need to pass arguments into the function, and then I need to print the results.
I can’t conceive of a way to do any of that more tidily with a decorator.
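For illustration, the kind of ad-hoc test block described above might look like the following minimal sketch. The function moving_average and the sample data are made up for this example; the point is only that set-up, arguments, and printing sit naturally inside the guard.

```python
def moving_average(values, window):
    """Return the average of each consecutive run of `window` values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Set-up, arguments, and output for a quick manual check.
    # This block is skipped when the module is imported.
    samples = [1, 2, 3, 4, 5]  # hypothetical test data
    result = moving_average(samples, window=2)
    print(result)
```

Moving this into pytest later mostly means turning the print into an assert.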


PS: if you’re teaching, could you get into the habit of not calling your entrypoint main()? If the function needs to be imported anywhere, it should not be called main(), because then it should have a meaningful name. If it’s not going to be imported anywhere, ‘just’ replace the boilerplate, for example:

imports

def main():
  AAAA
  BBBB
  CCCC
  DDDD

def auxiliary_functions(): ...

if __name__=="__main__":
  main()

==>

imports

def auxiliary_functions(): ...

if __name__=="__main__":
  AAAA
  BBBB
  CCCC
  DDDD

I feel like repeating the pattern

if __name__=="__main__":
  main()

causes confusion among beginners. It causes people to imagine parsing rules that don’t exist.

The decorator would only work if the main function is defined at the end of the module (or after any functions it calls), which might be confusing. I often put my main function near the start of the script.

Why would you say that? The only thing in the __name__ == '__main__' block would be the decorated function so it shouldn’t have any bearing on where the function is defined or what comes before or after it.

Because the decorator runs the function.

@entrypoint
def main():  # main is run now, at decoration time
    foo()  # NameError!

def foo():  # Then foo is defined
    ...

Incidentally, this


def entrypoint(f: func):
    (f, f())[0] if __name__ == '__main__' else f


will fail when entrypoint() is not defined in the __main__ module, because __name__ reflects where entrypoint is defined rather than where it decorates a function or where that function is called.
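One way around that particular problem, sketched here as an assumption rather than as the proposed implementation, is to read the module name from the decorated function itself via func.__module__. Note that this variant still calls the function at decoration time, so the ordering problem shown above remains.

```python
def entrypoint(func):
    """Call func right away if it was defined in the module run as a script."""
    # func.__module__ names the module that defined func. A bare __name__ here
    # would instead name the module that defines entrypoint.
    if func.__module__ == "__main__":
        func()
    # Always hand the function back so the decorated name stays usable.
    return func
```

With this version, entrypoint can live in a helper module and still react to where the decorated function was written.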

Honestly, I think these are both evidence that this feature would cause more confusion than the risk of mistyping "__main__" ever does.

-1, also because some Python packages and modules have more than one function that we want to use as an ‘entry point’:

# pyproject.toml

[project.scripts]
pyzip   = "PyZipFooBar:compress"
pyunzip = "PyZipFooBar:uncompress"

We’re not limited to a single public static void main ... or even a canonically named main() function, albeit that we often choose main.

I use python -m build to create my packages, which creates the boilerplate for us: the #! shebang, if __name__ ..., and (no longer) the PEP 263 # -*- coding: utf-8 -*- line from before UTF-8 dominated.

As for standalone scripts where one writes the shebang and if __name__ == "__main__" oneself, it’s often a noob’s first sight of dunders, a sight of inspiration and a learning opportunity.

I don’t know if a decorator is a good idea, but I must admit that if __name__ == '__main__': is difficult to explain. It seems like some legacy trait of Python.

Why would the name of the module chosen to run change to '__main__'?

Because that’s what was chosen to represent the top level of the module hierarchy. The first file run by the interpreter will be given the name __main__ so you can tell if the file was executed directly or not.

That also means that if you execute a submodule, it will not have access to its parent namespaces unless those are explicitly added to sys.path.

Since you can also import __main__ from any submodule, that means those submodules can check where the top level of the namespace is without having to check actual filenames.
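To make the naming concrete, here is a small sketch using the standard library’s runpy module to execute the same file under two different module names. The file name hello.py and the variable observed_name are made up for the example.

```python
import os
import runpy
import tempfile

# A one-line module that records the name it was executed under.
SOURCE = "observed_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "hello.py")
    with open(path, "w") as fh:
        fh.write(SOURCE)

    # Roughly what happens for `python hello.py`: the module is named __main__.
    as_script = runpy.run_path(path, run_name="__main__")
    # The same file, executed under an ordinary module name instead.
    as_module = runpy.run_path(path, run_name="hello")

print(as_script["observed_name"])
print(as_module["observed_name"])
```

The file on disk is identical in both runs; only the name the interpreter assigns to the module differs.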

I think they were rather talking about the aspect of teaching this. If some person new to Python made a hello.py, why would the name (of the module, which is easy to think of as the filename) change to __main__ now?

Perhaps some builtin is_main function could help with that, although the name would be something to think about. It should be builtin in my opinion, so no (confusing) imports are required. The pattern is very common, so why not make it a bit easier for beginners?
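As a rough sketch of what such a helper could do, here is a hypothetical is_main written as an ordinary function. It is not an existing builtin, and it relies on sys._getframe, which is a CPython implementation detail, to look at the caller’s globals.

```python
import sys


def is_main():
    """Hypothetical helper: True if the caller's module is running as __main__."""
    # sys._getframe(1) is the frame of whoever called is_main().
    caller_globals = sys._getframe(1).f_globals
    return caller_globals.get("__name__") == "__main__"
```

A script would then write `if is_main(): ...` in place of the usual comparison, though the helper still has to be found somewhere unless it really were a builtin.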

The aspect of the __name__ attribute of the global namespace changing to __main__ almost seems like an implementation detail, so wrapping this behavior in a simple function makes sense to me.

That’s a false assumption though. If you think “module name is file name”, yes, other module names WILL be a big surprise, and then everything becomes magical. But they’re not. Module names are in an abstract namespace of modules, file names are in a concrete namespace of paths and files. Not every module exists on the file system.

Start by teaching that modules are more than just files, and then “the main module is named __main__ no matter where it comes from” makes complete sense, as does “to find out if you’re the main module, check to see whether you’re named __main__”. Well, okay, to be fair, it only makes COMPLETE sense if you already know about dunders, but if not, this is a good opportunity to at least lay some groundwork by showing that Python puts two underscores before and after certain special names.

This is a good point to explain the difference between executing a script and importing a module, something that also comes up when you need to explain why a relative import fails, and why an absolute import fails when you confuse the current working directory with the directory the script lives in.

Since no one has pointed this out: this isn’t actually a functional definition of the decorator (even ignoring the syntax errors); the module’s __name__ needs to be taken from the function object. And even then questions come up, like what does entrypoint(input) do? Sure, it might seem nonsensical, but it’s valid syntax, and I can imagine if __name__ == '__main__': input() being used to prevent a console window from closing instantly.

My point is that the semantics of entrypoint aren’t obvious and they definitely won’t be easy to explain, whatever they end up being. I don’t think this will reduce confusion for beginners. They will just have to accept it as “magic syntax”, which is exactly the same as if __name__ == '__main__'.
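A short sketch of why entrypoint(input) is awkward, assuming a decorator that decides based on the function’s module. The helper defined_in_main is hypothetical and only stands in for whatever check a real entrypoint would perform.

```python
def defined_in_main(func):
    """Hypothetical check a decorator might use to decide whether to call func."""
    return getattr(func, "__module__", None) == "__main__"


# Built-in functions are not defined in any script, whoever references them.
print(input.__module__)               # builtins
print(hasattr(input, "__globals__"))  # False
print(defined_in_main(input))         # False
```

Under this check entrypoint(input) would silently do nothing, even when written in the script being run, which is one more rule a beginner would have to be told.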

(I would honestly be in favor of a more radical proposal of just calling a function named main in the primary script, but that has backwards compatibility concerns that I don’t have a good solution for.)

Still not the greatest fan of using a function call. It’s not uncommon to change global state when you execute a file directly:

DEBUG = False

def main():
    if DEBUG:
        print("debugging")
    else:
        print("running normally")

if __name__ == "__main__":
    DEBUG = True
    main()

This is what it is used for. It is not an entry point of any kind, even though it resembles entry points in other languages. It is an execution context guard and can be used anywhere in the code. You can use if __name__ == "string" for validation, live testing, and similar purposes. I am not sure why it is taught as an entry point when it is not.

Just for fun: assume what you actually want is for an @entrypoint-decorated function to run when the entire main module completes (NOT when the entrypoint function is defined), so that it mimics the typical pattern of calling main() in a __name__ == '__main__' guard at the end of a main module. You can save the frame in which the entrypoint function is defined, and have atexit register a wrapper that calls the entrypoint function only if the index of the last instruction that ran in that frame equals the size of the frame’s bytecode minus 2 (because a bytecode instruction is always 2 bytes long).

Here’s an implementation that works with CPython:

import sys
import atexit

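def entrypoint(func):
    # Only act when func was defined in the module being run as a script.
    if func.__globals__['__name__'] == '__main__':
        # Frame of the module body that is applying the decorator.
        frame = sys._getframe(1)
        def run_entrypoint_if_main_complete():
            # f_lasti is the offset of the last instruction executed in the frame;
            # it is the final offset only if the module body ran to the end.
            if frame.f_lasti == len(frame.f_code.co_code) - 2:
                func()
        atexit.register(run_entrypoint_if_main_complete)
    return func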

so that:

@entrypoint
def main():
    print('World')
print('Hello')

outputs:

Hello
World

but:

@entrypoint
def main():
    print('World')
sys.exit()
print('Hello')

outputs nothing because the main module gets aborted midway.
