Built-in is_main() function as a more beginner-friendly alternative to if __name__ == "__main__"

vovavili · August 29, 2022, 9:47pm

Hello all!

I was wondering if there could be some sort of alternative for indicating that a script is meant to be run directly, one that does not force beginners to wrap their heads around magic methods and how Python works under the hood. Judging purely from a cursory overview of the beginner-to-intermediate educational content available for free, the idea of using magic methods for this specific purpose does seem to be the biggest source of confusion for people new to Python: a video explaining what this bit of code does by Corey Schafer is sitting at 1.7 million views (one of his best), and a video explaining the same concept by mCoding is at 865 thousand views (his most viewed video ever). The question asking what this idiom does on StackOverflow has more than 7600 upvotes and more than 3200 bookmarks (second most upvoted Python question). Furthermore, this fits in line with the introduction of dataclasses, which are basically a way of declaring classes without writing boilerplate magic methods.

How about instead of:

if __name__ == "__main__":
    main()

Python gets this slightly more readable alternative that does the exact same thing:

if is_main():
    main()

What are your thoughts about this proposal?
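For reference, a helper with the proposed behaviour can be approximated in pure Python today. This is only a minimal sketch: is_main() is a hypothetical function, not an existing built-in, and it assumes the interpreter supports frame introspection (as CPython does).

```python
import inspect


def is_main():
    """Hypothetical helper: True if the *calling* module is being run as a script."""
    frame = inspect.currentframe()  # may be None on interpreters without frame support
    if frame is None or frame.f_back is None:
        return False
    # Look at the globals of whoever called is_main(), not at this module's own.
    return frame.f_back.f_globals.get("__name__") == "__main__"


def main():
    print("running as a script")


if is_main():
    main()
```

Under the hood this still compares `__name__` with `"__main__"`; the helper only hides the comparison behind a function call.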

pf_moore · August 29, 2022, 9:55pm

Why not simply call main() with no conditional? Or just write the contents of main() at the top level and skip having a main() function at all?

You only need the if __name__ == "__main__" check if you want to have a script that can be run as a script or imported as a module. And honestly, that’s not a particularly common need until you’re probably perfectly capable of understanding what it does.

Although I concede that people are taught to use the if __name__ check right from the start, and that is problematic, as you say, because it involves concepts they don’t know yet. But that’s a problem with how Python is being taught, not with the language itself.

vovavili · August 29, 2022, 10:16pm

@pf_moore Good point. So you don’t think that Python should evolve in a way that increases its pedagogic potential when, as it stands, in many cases the pedagogical material out there is flawed? It’s one thing to say “many teaching code snippets out there are flawed, c’est la vie”, but I feel like the more constructive approach would be “many teaching code snippets out there are flawed, but an insubstantial addition to Python can make the pedagogical flow more natural to beginners in spite of that unfortunate situation, as educators wouldn’t find themselves in a situation where they are inadvertently teaching magic methods to people before it is appropriate”.

guido · August 29, 2022, 10:31pm

Given the simplicity of Paul’s suggestion (“just call main()”) I don’t see that a new built-in (which is a maintenance and documentation burden for generations to come) would be a good addition to the language.

Rosuav · August 30, 2022, 1:41am

Beginners should just assume that a script is meant to be run directly. Importing modules can come later. And if there’s an importable module with no “name is main” block and you run it, chances are it’s going to quietly do lots of nothing.

IMO the problem is beginner courses that throw unnecessary boilerplate at their students, not the exact nature of that boilerplate. Try starting a brand new Java project. Now start a brand new C# project. Do you actually learn what the boilerplate means, or do you just look for something to copy and paste? (And that’s true for non-beginners too, in many many cases.)

aroberge · August 30, 2022, 2:08am

I think that having only one way to do something as basic as this is plenty. And, by now, there are so many resources that mention it that it would likely be counterproductive to introduce a different way to express the same thing.

If a second way was to be added, I would much prefer to see this written as follows:

if not __imported__:   # or if not_imported:
   ...

which I see as having a much clearer meaning.

And, regarding the ideas of having a single main() special builtin, in some scripts, I’ve had to write

if __name__ == "__main__":
   ...

twice (once at the top for special import, and once more at the bottom). In this case, the idea of a main() function/builtin would not be so useful.

steven.daprano · August 30, 2022, 2:08am

All newcomers to Python are being taught this? A majority? A minority? Hardly any of them?

Who is teaching this? Does the Python tutorial teach this?

These are not really rhetorical questions. Especially not the one about the tutorial.

My point is that we should not concede the assumption that newcomers to Python are being taught this idiom. Perhaps some are, but many are not.

Nor should we concede the assumption that this is a bad thing. I think that, as experienced programmers, we may sometimes forget that it is pretty much impossible to teach somebody a first language without introducing them to concepts that they don’t know yet.

We should remember that beginners will be introduced to many concepts before they understand them. If they already understood them, they wouldn’t need to be taught! Basics like:

what’s a function?
what’s a class? (this one broke my brain for the longest time!)
what’s a variable?
why do some words have quotes around them and others don’t?
how do you start the interpreter or run a script?
what’s the difference between the OS shell and the Python REPL?

(Not that they even know the terms “shell” and “REPL”.)

It is my experience that when teaching programming to absolute beginners there will always be the need to give at least some minimal boilerplate that some students won’t understand (yet!) and have to take on faith, “That’s just what you do to make it work”. This is part of the learning process.

steven.daprano · August 30, 2022, 2:33am

The standard idiom looks a little intimidating because of the underscores, but it really isn’t complicated. We have a global variable (not a special method) called “name” (plus some underscores) that tells you the name of your module.

If that name is “main” (plus some underscores) that means you are running the module as a script.

What makes this complicated for beginners is that they don’t always understand the difference between modules which are used as scripts and modules which are used as importable libraries. That’s a moderately advanced concept.

The first edition of “Learning Python” by Lutz and Ascher doesn’t introduce the __name__ == '__main__' trick until page 138, chapter 5, where it is talking about modules and the difference between scripts and libraries. That seems like the right place to me.

Modules which are intended to be used only as scripts don’t need this at all, and shouldn’t have it. Neither do library modules which are not intended to be run as scripts. Only those which are intended to be used as both need this.
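The behaviour of `__name__` described above is easy to check. The following sketch writes a throwaway file (the name greet.py is made up for illustration), then executes it once the way `python greet.py` would and once as an import, recording what `__name__` was in each case.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# A one-line module that simply records the value of __name__ it was given.
SOURCE = "seen_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greet.py"
    path.write_text(SOURCE)

    # Case 1: execute the file as a script, like `python greet.py`.
    as_script = runpy.run_path(str(path), run_name="__main__")

    # Case 2: load the same file as an importable module named "greet".
    spec = importlib.util.spec_from_file_location("greet", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(as_script["seen_name"])  # __main__
print(module.seen_name)        # greet
```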

vovavili · August 30, 2022, 2:56am

I think this is a brilliant suggestion as well, if Python developers are reluctant to add any built-in functions. Basically, keep the current way of doing things with magic methods as is, but make the control flow speak for itself instead of pushing beginners to dig around reference manuals to understand the meaning of an idiom.

Do others agree, or do you think that this needlessly violates the “there should be one-- and preferably only one --obvious way to do it” principle (emphasis on “only one”, not on “obvious”)?

CAM-Gerlach · August 30, 2022, 6:01am

I’ve had to grapple with explaining this concept, and consequently if __name__ == "__main__", to a number of students that I mentor/tutor recently; after they have the very basics mastered, this often comes in handy at the stage where I am showing them how to structure their code into functions, and, then, import them as modules in other scripts, but before they’ve fully decoupled their module and script code (which can often involve considerable refactoring, especially since they are typically learning this while working on real-world research projects, with Python being primarily a tool rather than the main objective).

As such, I can to some extent sympathize with the non-obviousness of why this check works and what it is actually doing. In hindsight, perhaps this could have been made simpler. For your teaching, one could perhaps offer a support module/package with helpers abstracting such lower-level operations until one’s students are ready to learn them, which could provide this function.

However, I believe initially I was trying to explain in too much detail how and why exactly the check works as it does, a temptation which I sometimes succumb to, as opposed to abstracting as simply “this block will run when the file is executed as a script rather than a module” and deferring a detailed explanation of the “how” until later, and focusing on what is both the much more important, necessary and interesting concept, the fact that Python files can be executed as both scripts and modules, and the difference between libraries and applications that this is linked with.

We don’t want to encourage mindlessly copying boilerplate and cargo cult programming, but I believe that might be best served at the macro level by explaining why you’d add such a check, rather than how it works, and perhaps including a comment mentioning such in beginner-focused materials.

Sidenote, but this is something I’ve seen beginners constantly struggle with; first understanding the “why”, and intuitively grasping in which contexts string literals versus identifiers are required—it can be hard to remember despite being a beginner myself only a few years ago, but it takes a lot of time and careful attention to internalize this, especially when it comes to spotting and avoiding constant bugs in code caused by this.

encukou · August 30, 2022, 8:48am

The standard idiom is hard to teach to someone just learning about importing, global variables becoming module attributes, and avoiding import-time side effects. “Pattern-matching” students will get confused by where the quotes go; analytical minds will be surprised by an auto-assigned global variable that works differently from globals. It’s a lot of new stuff at once.

So in my own courses, I tell students to make separate importable modules and executable scripts. It’s easy to explain, and IMO it’s also best practice for bigger projects. (If you import the main module under another name, you get 2 independent modules with the same content, which is almost never what you want. A separate two-liner import+call module avoids this, as no one will want to import from that.) And after you get comfortable with imports, if __name__ == "__main__" is much easier to explain.
(For more advanced students, you can throw in a discussion on how reliance on accidental implementation details creates hard-to-fix issues: in this case, python -m foo can’t name the executed module foo, making it easy to double-import. Using an introspection-related variable to see how a module was imported is actually a very … creative hack.)
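The double-import problem can be reproduced in a few lines. In this sketch the file names tool.py and helper.py are invented for the demonstration: tool.py is run as a script, and helper.py imports it again under its real name, so two independent copies of the module end up loaded.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

TOOL = '''\
registry = []

if __name__ == "__main__":
    import helper  # helper imports this same file again, as "tool"
    import sys
    print(registry)                       # the copy running as __main__
    print(sys.modules["tool"].registry)   # the copy imported as "tool"
    print(sys.modules["__main__"] is sys.modules["tool"])
'''

HELPER = '''\
import tool
tool.registry.append("added by helper")
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "tool.py").write_text(TOOL)
    Path(tmp, "helper.py").write_text(HELPER)
    # Run `python tool.py` in a fresh interpreter, from inside the temp directory.
    result = subprocess.run(
        [sys.executable, "tool.py"],
        cwd=tmp, capture_output=True, text=True, check=True,
    )

output = result.stdout.splitlines()
print(output)  # ['[]', "['added by helper']", 'False']
```

The entry added by helper.py lands in the "tool" copy, while the script's own registry stays empty, which is the kind of surprise a separate two-line launcher script avoids.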

If a second way is added, I’d suggest avoiding if entirely and using def __main__(): – a function that’s called when a script is executed, similarly to how __main__.py is the entry point in a package.
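To make the def __main__(): idea concrete, here is a rough sketch of what a launcher honouring such a hook might do. Nothing like this exists in Python today: run_with_main_hook and the demo.py file are hypothetical, and passing argv to the hook and using its return value as the exit status are assumptions made for the illustration.

```python
import runpy
import tempfile
from pathlib import Path


def run_with_main_hook(path, argv):
    """Hypothetical launcher: run a file as a script, then call its __main__(argv)."""
    namespace = runpy.run_path(str(path), run_name="__main__")
    hook = namespace.get("__main__")
    if not callable(hook):
        return 0  # a plain script with no hook: nothing more to do
    return hook(argv) or 0  # treat a None return value as success


SCRIPT = '''\
def __main__(argv):
    print("arguments:", argv[1:])
    return 0 if len(argv) > 1 else 2
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "demo.py")
    script.write_text(SCRIPT)
    status = run_with_main_hook(script, ["demo.py", "--verbose"])
    status_no_args = run_with_main_hook(script, ["demo.py"])

print(status, status_no_args)  # 0 2
```

Because the hook is called only after the whole file has been executed, every function in the file is already defined by the time it runs.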

steven.daprano · August 30, 2022, 10:14am

(Let me say up front this is not a suggestion for a change to Python.)

In the early 1990s, Apple’s Hypertalk scripting language treated quotation marks as optional when possible. If your string was a single word, and didn’t match a variable, you could leave the quotation marks off: put Hello into greeting or let greeting = Hello.

Another way of thinking about it is that if you had something which was syntactically a variable, like Hello, and it was undefined, it evaluated to the string “Hello”.

It also worked with object identifiers. send mouseDown to button Cancel would send the mouseDown event to the button called “Cancel”. This worked really well, until you had a variable called “cancel” that contained the name, or number, of some other button.

pf_moore · August 30, 2022, 10:52am

I think it’s unnecessary. Like many people have said, I never taught beginners to use if __name__ == "__main__" in the first place. I started by teaching people how to write scripts (which don’t need any form of guard). Then, when they were experienced enough to need to factor their code into multiple modules, there’s still no problem as long as they keep “main scripts” and “importable modules” as two separate things in their minds.

The mere idea of writing a file that’s usable both as an importable module, and as a main script, is actually a very rare beast, with subtle implications, and people simply shouldn’t be doing that unless they know what they are doing.

I agree there’s a lot of misinformation around on this matter, and probably also a lot of code that uses the if __name__ idiom unnecessarily, which makes the problem worse, as people blindly copy code without understanding it (a reality we have to accept happens, even if we don’t agree with it). But this doesn’t mean we need to change the language so that people’s misconceptions become true. Rather it means we should educate people better (for example, adding a proper explanation to the Stack Overflow question the OP mentioned, that basically says “you should almost never need to use this, start by checking why you think you need to”).

Of course the real problem is that it’s far easier to say “let’s change the language” than it is to correct the misconceptions of thousands of people posting inaccurate information on the internet. That doesn’t mean it’s the right thing to do, though.

Rosuav · August 30, 2022, 11:05am

But, significantly, introducing a completely new way to achieve the same goal won’t fix the blind-code-copy problem - it’ll make it worse, because now there’s two different things people will copy and paste. For instance, if the language is changed so that def __main__(args): gets called automatically, I would confidently say that there’d be people who would end up putting if __name__ == '__main__': __main__(sys.argv) at the ends of their scripts. And then there’d need to be ANOTHER layer of boilerplate that says if sys.version_info < (3,15): guarding that, because that weird double-invocation would actually be correct on older versions of Python, and then that would get mistakenly copied and pasted everywhere too, and so on ad infinitum.

So here we are: usually not needing to check __name__, often checking it unnecessarily, and then moaning that students have to be aware of things they actually could gloss over. Welcome to life as a programmer, folks, it’s not perfect but it’s a lot more fun than some of the alternatives!

vovavili · August 30, 2022, 7:46pm

One fun alternative I see that is similar to your proposal is a self-explanatory decorator (though I am not sure how comfortable beginners are with decorators as opposed to magic methods), for example

@main_function
def main():
    ...

This is kind of similar to a more restricted version of @atexit.register, when you think about it.

encukou · August 31, 2022, 10:03am

The shim should be if __name__ == '__main__': exit(__main__(sys.argv))
If that becomes common, we’re fine. It’s easier than the other layer of boilerplate, so IMO it has a chance.

Nice! You can even implement it today, e.g.:

import sys

def main_function(func):
    # Call func right away, but only if its defining module is the script being run.
    if func.__module__ == "__main__":
        func(sys.argv)
    return func

(There is a trade-off: exit() with the return value vs. allowing code below this to run. I don’t think using atexit is a solution: running the entire script as part of interpreter shutdown sounds very scary.)

The next step is problematic, though. Usually the best way to get code in stdlib is to put it on PyPI first, and see if people like it. Unfortunately, this shares atpublic’s problem: you only need this once per project, and the “old way” is easier than adding a dependency (or even vendoring – copy/pasting – the decorator). So it’s quite useless if it’s not in stdlib.
And if it’s in stdlib, a __main__() that’s called after the module is imported works better (solves the tradeoff above, and makes the function available while it’s called).

vovavili · August 31, 2022, 1:38pm

Okay!

PyPI

GitHub

Source code

I will also try to post this on reddit and HackerNews and see if this idea will gain any traction among ordinary Pythonistas.

Rosuav · August 31, 2022, 2:52pm

Please don’t misrepresent the status quo. Your PyPI description says that the “name is main” idiom indicat[es] that a script is meant to be run directly; in actual fact, it indicates that the script is meant to BOTH be run directly AND be imported as a module. That’s why multiple of us in this thread have said that this isn’t a major problem.

By contributing to the misinformation, you might increase the number of people who use this alternate idiom, but please don’t. It’s not fair to the truth.

vovavili · August 31, 2022, 3:02pm

Thank you for your input! I will fix this in a jiffy.

Traditionally, this idiom indicates that a script is meant to both be run directly and be imported as a module, but its (possibly unjustified) prevalence in educational materials nudges beginners to wrap their head around magic methods and how Python works under the hood before it might be appropriate.

pf_moore · August 31, 2022, 3:12pm

Much better. Maybe also add “This decorator provides an alternative for people who don’t want to simply remove the if __name__ == "__main__" test altogether for some reason.” Because I still believe that it’s better to educate people that the idiom simply isn’t needed rather than offering alternative ways to spell it.
