__init__.py, PEP 420, and iter_modules confusion

Ever since PEP 420 was implemented, the web has offered a plethora of advice along the lines of: “Since Python 3.3 you no longer need the __init__.py to create a package…”, with or without reference to namespace packages and PEP 420.

The problematic scenario is as follows: someone creates a package in site-packages without the said __init__.py file. They test it using a direct import and it works just as the “web” said. Then later someone else tries to discover the same package programmatically using pkgutil.iter_modules() and the code no longer works, because iter_modules() only reports a directory as a package if it contains an __init__.py file.
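A minimal sketch of that scenario, using a temporary directory in place of site-packages. The package names `demo_ns_pkg` and `demo_reg_pkg` and the helper `discover` are made up for illustration:

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def discover(root):
    """Return the top-level names pkgutil.iter_modules() reports under root."""
    return sorted(info.name for info in pkgutil.iter_modules([str(root)]))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Directory WITHOUT __init__.py: importable as a namespace package.
    (root / "demo_ns_pkg").mkdir()
    (root / "demo_ns_pkg" / "mod.py").write_text("VALUE = 1\n")

    # Directory WITH __init__.py: a regular package.
    (root / "demo_reg_pkg").mkdir()
    (root / "demo_reg_pkg" / "__init__.py").write_text("")
    (root / "demo_reg_pkg" / "mod.py").write_text("VALUE = 2\n")

    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    try:
        # Direct imports succeed for both directories.
        ns_mod = importlib.import_module("demo_ns_pkg.mod")
        reg_mod = importlib.import_module("demo_reg_pkg.mod")
        # Discovery only reports the directory that has __init__.py.
        found = discover(root)
    finally:
        sys.path.remove(str(root))

print(ns_mod.VALUE, reg_mod.VALUE)
print(found)
```

Both imports work, yet only `demo_reg_pkg` shows up in the discovered names, which is exactly the mismatch described above.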

I have been asked to explain this, and the truth is I can’t. I reduce the answer to: RTFM, the Python.org doc literally states “The __init__.py files are required to make Python treat directories containing the file as packages.” Ignore the Stack Exchange stuff, it is plain wrong: the __init__.py file is needed and that is that.
My answer may be a cop-out, but I don’t know what else to say. Am I missing something, or did the namespace package concept introduce this contradiction?

Please read my article.
Don’t omit __init__.py - DEV Community

Well, your article says what I already know, namely that the __init__.py file is needed, confirming that most of the web advice is wrong.

I am looking more for a discussion of how PEP 420 actually caused or contributed to this problem. It allows any plain directory to be treated as a [namespace] package. It essentially indirectly promotes creation of namespace packages in place of regular packages (the former looks simpler). I was quite surprised that the PEP would not explicitly address the most obvious problem with eliminating the __init__.py. In my mind I always treated the __init__.py file as the mechanism creating a package from a plain directory and here it is being removed. Hence my deep suspicion that I may have been missing something…

FWIW, I had proposed adding a note at the top of PEP 420.
See Issue 36892: "Modules" section in Tutorial contains incorrect description about __init__.py - Python tracker

I think a note at the top of PEP 420 would be quite useful.
Personally, I think that PEP 420 was a mistake – adding an incompatible and obscure feature that in turn produced numerous wrong tutorials, misleading countless programmers. Not something I would expect in what I always considered a very clean language.

I don’t think PEP 420 was a mistake.
The mistake is that people are promoting namespace packages as a replacement for regular packages.

At least the official tutorial clearly states:

The __init__.py files are required to make Python treat directories containing the file as packages. This prevents directories with a common name, such as string, unintentionally hiding valid modules that occur later on the module search path.

6. Modules — Python 3.9.6 documentation
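To make the hiding behaviour in that quote concrete, here is a small sketch under stated assumptions: two temporary directories stand in for two entries on the module search path, and `shadow_demo_mod` and `import_fresh` are hypothetical names. A same-named directory without __init__.py does not hide a module found later on the path, while adding __init__.py makes the directory win.

```python
import importlib
import sys
import tempfile
from pathlib import Path

NAME = "shadow_demo_mod"  # hypothetical name used by both path entries


def import_fresh(name):
    """Re-import name from scratch so the search path is consulted again."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    return importlib.import_module(name)


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    # Earlier path entry: a plain directory with the same name, no __init__.py.
    (Path(first) / NAME).mkdir()
    # Later path entry: an ordinary module.
    (Path(second) / f"{NAME}.py").write_text("KIND = 'module'\n")

    sys.path[:0] = [first, second]
    try:
        # The plain directory is only a namespace candidate, so the module wins.
        without_init = import_fresh(NAME).KIND

        # With __init__.py the directory is a regular package and hides the module.
        (Path(first) / NAME / "__init__.py").write_text("KIND = 'package'\n")
        with_init = import_fresh(NAME).KIND
    finally:
        sys.path.remove(first)
        sys.path.remove(second)
        sys.modules.pop(NAME, None)

print(without_init, with_init)
```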

I don’t believe people understood PEP 420 and its consequences. The problems were already clear nine years ago, when people were asking about the purpose of the PEP. Consider this random Reddit thread:

I’ll summarize it here:
Q: Can someone summarize what this means?
A: Make a package, just by making a subdirectory (of a directory on sys.path)
Comment: "Cool, another ugly __ thing __ removed."

As you can see, the explanation is strongly misleading. What is being created is not just “a package”. What is being created is a “very special package” that breaks some well-established conventions, most notably, as you quoted above, that the __init__.py file is the indicator that a directory is a package. Unfortunately, most people just walked away thinking that the hard-to-read PEP 420 simply does away with the __init__.py file.
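One way to see how such a “very special package” differs at runtime is sketched below. The helper `package_kind` and the `kind_demo_*` names are hypothetical and not part of any stdlib API. The check assumes that a namespace package has a `__path__` but no usable `__file__`, since there is no __init__.py for it to point at:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def package_kind(module):
    """Classify an imported module (illustrative helper, not a stdlib API)."""
    if not hasattr(module, "__path__"):
        return "module"
    # No __init__.py means there is no file for __file__ to refer to.
    if getattr(module, "__file__", None) is None:
        return "namespace package"
    return "regular package"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "kind_demo_ns").mkdir()
    (root / "kind_demo_ns" / "mod.py").write_text("")
    (root / "kind_demo_reg").mkdir()
    (root / "kind_demo_reg" / "__init__.py").write_text("")

    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    try:
        names = ("kind_demo_ns", "kind_demo_ns.mod", "kind_demo_reg")
        kinds = {name: package_kind(importlib.import_module(name)) for name in names}
    finally:
        sys.path.remove(str(root))

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

Tools that need to treat the two kinds differently could use a check along these lines, though it is only a heuristic for ordinary file-system imports.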

If people had actually read PEP 420 carefully, they would have noticed the very first two statements in its specification:

Regular packages will continue to have an __init__.py and will reside in a single directory.
Namespace packages cannot contain an __init__.py.

And, like me, they could have asked “If the __init__.py file is needed to distinguish between directories that are packages and those that are not (something that has always been the case and something that tools like pkgutil rely on), then how can the PEP be proposing that the init file be optional?”

That is really a rhetorical question pointing out that the proposal would break existing stuff. And that is why I call it a mistake. I would even go as far as speculating that the reason PEP 420 was accepted had more to do with the mistaken belief that it would simplify packaging than with the esoteric feature it introduced.