I’m looking for a way to inhibit creation of .pyc files for modules installed in a particular location, while not inhibiting the creation of .pyc files for modules installed anywhere else. Is there a way to accomplish this?
Rationale/Details:
I’m in a networked environment with 1000s of users.
The network has multiple platforms and must support a “standard” set of Python versions, so I have a set of Python virtual environments, one for each combination of Python version and platform.
We have a collection of Python-only modules that are intended to be used in all of these virtual environments.
There are multiple maintainers for the module source, and deployment needs to be as simple as pushing the updated source files to a common network location. All virtual environments have a .pth file that extends sys.path to find these modules in this common network location.
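For anyone unfamiliar with the mechanism: a .pth file is just a text file of extra paths, one per line, placed in a site directory. A minimal self-contained sketch (temp directories stand in for the venv’s site-packages and the common network location):

```python
import os
import site
import sys
import tempfile

# Sketch of the .pth mechanism described above: a .pth file in a site
# directory lists extra paths, one per line, which get added to sys.path.
# Both directories here are stand-ins for the real locations.
site_dir = tempfile.mkdtemp()      # stand-in for the venv's site-packages
network_dir = tempfile.mkdtemp()   # stand-in for the common network location

with open(os.path.join(site_dir, "shared_modules.pth"), "w") as f:
    f.write(network_dir + "\n")

# site.addsitedir() processes .pth files the same way site-packages is
# processed at interpreter startup.
site.addsitedir(site_dir)
print(network_dir in sys.path)  # True
```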
Because there are multiple maintainers, the first person (user “A”) to do “import ecommlib” will trigger creation of .pyc files in the common network location, and will own those .pyc files as a result. The next person who pushes changes to the ecommlib source will cause a problem for everyone except that first user, because new .pyc files will be needed, but only user “A” can overwrite the existing ones.
So, I want to avoid creation of .pyc files for these common/shared Python-only modules, but I don’t want to use the “-B” sledgehammer and inhibit .pyc creation globally; that isn’t necessary, and I don’t want to impact anyone else’s performance with their own code.
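For what it’s worth, one partial workaround I’ve considered is flipping sys.dont_write_bytecode around the import, but that flag is process-wide while set, so it scopes the suppression in time rather than by location, and every consumer would have to do it. A minimal sketch (a temp directory stands in for the shared network location):

```python
import os
import sys
import tempfile

# Stand-in for the shared network location, with a dummy ecommlib module.
shared = tempfile.mkdtemp()
with open(os.path.join(shared, "ecommlib.py"), "w") as f:
    f.write("VERSION = '1.0'\n")
sys.path.insert(0, shared)

# sys.dont_write_bytecode is process-wide while set, so this scopes the
# suppression in time, not by location.
saved = sys.dont_write_bytecode
sys.dont_write_bytecode = True
try:
    import ecommlib
finally:
    sys.dont_write_bytecode = saved

print(ecommlib.VERSION)  # 1.0
print(os.path.exists(os.path.join(shared, "__pycache__")))  # False
```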
I’m surprised the shared Python-only modules aren’t read-only to the 1000s of users already, to prevent malicious tampering from only 0.01% of them bringing the entire organisation to a halt.
Is it problematic to tell the devs with write access not to import directly from it, as doing so will break prod?
The fundamental flaw here is that the current architecture is completely dysfunctional and has no principle of least privilege. Some kind of source repo for the devs, and a package index for the users, is needed. If they all install local copies from that, they should find it far harder, if not impossible, to mess things up for the 999 others.
Thank you for your feedback! I appreciate it very much. However, I think you’ve made a few incorrect assumptions. We do have a source repo for the devs, and the installed python-only module files are read-only. The exposure we have is that the directory structure isn’t read-only. With writeable directories, anyone who can read the files (not just the devs!) will cause the generation of .pyc files when they import the library. That’s all I’m trying to prevent. While it’s a fair criticism that the directories should be read-only as well, pointing that out isn’t really a solution, unless I’ve misunderstood you… ?
Note that I’ve tried to address the directory-permission issue within my team before, but there’s a massive amount of infrastructure that assumes directories are writable. I haven’t been able to get the team ‘herded’ toward a consensus that we need a separate unix group for exactly this purpose (or some similar mechanism to limit write access to the prod dirs), and that we need to bite the bullet and fix all the naively-implemented infrastructure bits that assume otherwise… But that’s really a different / orthogonal issue.
Not importing directly from prod doesn’t make sense to me… Where else would anybody import from? The module code has to be visible somewhere on a shared network filesystem, where it can be imported-- and wherever that is, Python will try to create .pyc files, so the problem would still exist (just in a different place).
Update: I think I have found a very simple way to address this. When the Python-only module is installed in the shared network location that the virtual environments import from, the install script just creates a read-only file named __pycache__ in each directory where Python would normally create the cache of .pyc files. This seems to work-- when a Python script imports the module, the interpreter does not create any .pyc files, presumably because the attempt fails internally and the failure is quietly ignored.
I can’t say I feel this is the most robust solution in the long term, but it’s dead simple, and it does seem to work with every Python version I’ve tried so far. Is there a downside to this (other than future-proofing against changes in Python that would treat failures to create .pyc files as a hard error)?
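To illustrate with something self-contained (a temp directory standing in for the shared network location): the read-only regular file occupies the name Python would use for its cache directory, so the cache write fails and the failure is silently ignored, while the import itself succeeds.

```python
import os
import pathlib
import subprocess
import sys
import tempfile

# Stand-in for the shared network location, with a dummy ecommlib module.
shared = pathlib.Path(tempfile.mkdtemp())
(shared / "ecommlib.py").write_text("VERSION = '1.0'\n")

# The blocking file: a regular file, not a directory, where the cache
# directory would go; made read-only for everyone.
blocker = shared / "__pycache__"
blocker.touch()
blocker.chmod(0o444)

# Import in a fresh interpreter so this process's state doesn't interfere.
result = subprocess.run(
    [sys.executable, "-c", "import ecommlib; print(ecommlib.VERSION)"],
    env={**os.environ, "PYTHONPATH": str(shared)},
    capture_output=True, text=True,
)
print(result.stdout.strip())  # 1.0 -- the import itself succeeds
print(blocker.is_file())      # True -- still a plain file, no cache dir
```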
You’re welcome Chris. Well done for fixing it! Thanks for generously sharing your solution too. I’ve wondered about that __pycache__ folder.
Sorry about the incorrect assumptions. I appreciate the frustration institutional inertia can cause, even for security concerns. Importing from prod and installing from prod are different things. Unless you’re telling me pip install creates .pyc files, installing from prod is fine; that’s what prod’s there for. The dev team can still import from prod too, under specific conditions-- i.e. as long as they’re in an environment that cannot or does not create .pyc files (e.g. with read-only permissions, or with PYTHONDONTWRITEBYTECODE set, respectively). So not from an environment that needs to deploy wheels to the packaging server (a local private ‘PyPI’), or that writes code changes to a repo, if that’s how it’s done.
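A quick way to confirm the environment variable takes effect (any non-empty value works, and it’s equivalent to running with -B):

```python
import os
import subprocess
import sys

# Verify that PYTHONDONTWRITEBYTECODE sets the interpreter's
# sys.dont_write_bytecode flag, the same effect as the -B option.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.dont_write_bytecode)"],
    env={**os.environ, "PYTHONDONTWRITEBYTECODE": "1"},
    capture_output=True, text=True,
)
print(result.stdout.strip())  # True
```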