I assume that a script that is created in /usr/bin or a similar directory (called scripts in the sysconfig installation scheme) from a console_scripts entry point would never need to import Python modules from /usr/bin itself. A pip-installed script looks like this:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from yyy import xxx
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(xxx())
Would it make sense to remove the script's directory from sys.path? E.g. do something like this in each such file:
import os
import sys

script_dir = os.path.dirname(os.path.abspath(__file__))
if script_dir in sys.path:
    sys.path.remove(script_dir)
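As a rough check of what such a removal buys, here is a sketch (the file names tool.py and sibling.py are made up for this demo): a script that drops its own directory from sys.path can no longer import a module file sitting next to it.

```python
import os
import subprocess
import sys
import tempfile

# Create a throwaway directory holding a "script" plus a sibling module
# that could otherwise shadow a real import.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "sibling.py"), "w") as f:
        f.write("VALUE = 1\n")
    with open(os.path.join(d, "tool.py"), "w") as f:
        f.write(
            "import os, sys\n"
            "script_dir = os.path.dirname(os.path.abspath(__file__))\n"
            "while script_dir in sys.path:\n"
            "    sys.path.remove(script_dir)\n"
            "try:\n"
            "    import sibling\n"
            "    print('imported')\n"
            "except ImportError:\n"
            "    print('blocked')\n"
        )
    result = subprocess.run([sys.executable, os.path.join(d, "tool.py")],
                            capture_output=True, text=True)
    outcome = result.stdout.strip()
print(outcome)
```

With the removal in place the sibling import fails, which is exactly the behaviour the proposal wants for generated entry-point scripts.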
I can't think of any supported scenarios that would require it to work. Typically the files that are going to go into scripts are generated on install based on metadata anyway, so you don't really know what's going to be there.
Same here, but then again there isn't a standard for the exact code of an entry-point script, so individual installers will need to be updated (although example code in the Entry points specification in the Python Packaging User Guide would probably be a good thing).
In fact, I do this with my own wrapper scripts, example code:
import sys
sys.path[:] = [path for path in sys.path if path]
from cs.fstags import main
sys.exit(main(sys.argv))
Note I'm stripping empty paths, not a path which just happens to be where I'm standing.
This "trust where I'm standing (getcwd)" thing in Python's default sys.path makes me quite unhappy from a security standpoint, and has for years. If I want the modules in the working directory, I'll add that directory in full to the path explicitly.
When executing a script, the directory of the script is added to sys.path. This generally has nothing to do with the current working directory. Automatically adding the script directory by default is as safe as one's search PATH and execution habits permit (e.g. not executing files located in "~/Downloads"). Adding the current working directory by default is generally unsafe, but thankfully that doesn't happen when running scripts.
By default, the current working directory is added for "-c" and "-m" commands and the REPL, since there is no main script in those cases. It gets added as the empty string '', so it varies with whatever the current directory happens to be when an import is executed.
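The two cases can be observed directly. A minimal sketch (show_path0.py is a throwaway name for this demo): running a script file puts the script's directory at sys.path[0], while a -c command traditionally puts the working directory there as the empty string (newer interpreters may record an absolute path instead).

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as d:
    script = os.path.join(d, "show_path0.py")
    with open(script, "w") as f:
        f.write("import sys; print(repr(sys.path[0]))\n")
    # Running a script file: sys.path[0] is the script's own directory.
    script_entry = subprocess.run([sys.executable, script],
                                  capture_output=True, text=True).stdout.strip()

# A -c command: sys.path[0] is the working directory, historically ''.
cmd_entry = subprocess.run([sys.executable, "-c",
                            "import sys; print(repr(sys.path[0]))"],
                           capture_output=True, text=True).stdout.strip()
print(script_entry, cmd_entry)
```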
When executing a script, the directory of the script is added to sys.path. This generally has nothing to do with the current working directory. Automatically adding the script directory by default is as safe as one's search PATH and execution habits permit (e.g. not executing files located in "~/Downloads"). Adding the current working directory by default is generally unsafe, but thankfully that doesn't happen when running scripts.
That is nice to know; I've perhaps been letting my interactive testing mislead me about this. I'll test that. […] OK, testing shows that it does indeed add the script's directory and not the current directory. Adding it ahead of everything else is pretty iffy, convenience over caution IMO. But OK, I can keep this in mind.
By default, the current working directory is added for "-c" and "-m" commands and the REPL, since there is no main script in those cases. It gets added as the empty string '', so it varies with whatever the current directory happens to be when an import is executed.
And here we part company. I remain against this (with the possible exception of the REPL, still with misgivings). If I write some shell script and invoke:
python -m foo ...
it will very much NOT be my desire that the current working directory magically gets inserted into sys.path; my previously sound shell script suddenly has a component which can misbehave in a malicious setting. Such as that of the sysadmin doing some work inside an arbitrary user's directory, or inside a malicious software package (generic, not "Python package"). It needn't be a sysadmin; any user standing somewhere unfortunate gets this misfeature.
It is a security mine waiting to go off.
Python badly needs some switch to say "do not change sys.path at all". The -s and -S options do not provide this. Maybe it is too late to change the default Python behaviour here, but I remain convinced that this is a misfeature, and refer again to Heuer's Razor:
If it can't be turned off, it's not a feature. - Karl Heuer
As @hroncok mentioned in the message just above this conversation, this has been discussed at some length on previous BPOs, and @vstinner has an open PR for Python 3.11 that adds a -P option to stop adding the cwd to sys.path; additionally, -c will no longer do so by default (only -m). This is certainly quite welcome to many (including myself), and should hopefully address most of these concerns.
I consider it to be a reasonable design decision to give a script priority access to importing modules and packages in its directory. That said, I'm used to this. In Windows, the application directory has priority in SearchPathW(), CreateProcessW(), and, by default, LoadLibraryW(). An exception is made for reserved names of known system DLLs and API sets. I can see doing the same for core parts of the standard library. In fact, that's effectively implemented now by freezing critical modules, including _collections_abc, _sitebuiltins, abc, codecs, importlib, os, os.path, io, site, stat, and zipimport.
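A quick, non-authoritative way to check this on a given interpreter: frozen modules are loaded from the interpreter binary itself, so their `__spec__.origin` reports 'frozen' (on older interpreters, where the freezing has not happened yet, these are ordinary .py files instead).

```python
# Inspect where a few of the modules named above are imported from.
origins = {name: __import__(name).__spec__.origin
           for name in ("abc", "codecs", "os", "io", "stat")}
for name, origin in origins.items():
    print(name, origin)
```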
If "foo" is a module in the current working directory, then adding this directory to sys.path is required for the import. Where I part ways is with adding "" to sys.path in this case. The working directory should be added as a resolved path when running a -m module or -c command. Only the REPL should add an empty string to sys.path.
If "foo" is a package installed into site-packages, adding the current working directory to sys.path is not needed, and shouldn't happen.
And I'd actually argue that -m should not work for running packages that aren't installed (i.e., it shouldn't work for packages in the current working directory). You don't need -m in that case, as python foo works (although it adds the directory "foo" to sys.path, rather than the directory that contains foo, which I'd argue is wrong).
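The contrast can be sketched with a throwaway package (demo_pkg is a name invented for this demo, not a real package): `python <pkgdir>` puts the package directory itself on sys.path, while `python -m demo_pkg` run from the parent directory puts the working directory on it.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as parent:
    pkg = os.path.join(parent, "demo_pkg")
    os.mkdir(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()
    with open(os.path.join(pkg, "__main__.py"), "w") as f:
        f.write("import sys; print(sys.path[0])\n")

    # `python <pkgdir>`: the package directory itself lands at sys.path[0].
    by_dir = subprocess.run([sys.executable, pkg],
                            capture_output=True, text=True).stdout.strip()

    # `python -m demo_pkg` from the parent: the working directory does.
    by_m = subprocess.run([sys.executable, "-m", "demo_pkg"],
                          capture_output=True, text=True,
                          cwd=parent).stdout.strip()
print(by_dir)
print(by_m)
```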
The module search gives priority to the current working directory. That was a design choice in PEP 338. Nick Coghlan is the expert on that subject. Broadly speaking, I don't even use this feature myself, except for two cases: -m pip and -m venv.
My only qualm is with adding the working directory to sys.path generically as "", which remains for the lifetime of the process, affecting all imports according to whatever the working directory happens to be at the time, which can change any number of times. I think it should add the working directory as a resolved path.
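A minimal sketch of that suggestion (pin_cwd_entry is a made-up helper name): replace a bare '' entry with the resolved working directory, so later os.chdir() calls no longer shift what the entry means.

```python
import os
import sys

def pin_cwd_entry(path_list):
    """Return a copy of path_list with any '' entry replaced by the
    resolved current working directory at the time of the call."""
    cwd = os.path.realpath(os.getcwd())
    return [cwd if entry == '' else entry for entry in path_list]

# Applied early in a process, this pins the cwd entry once and for all.
sys.path[:] = pin_cwd_entry(sys.path)
```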
If I'm running python -m foo I truly don't care what's in the current directory; I want Python to find it in my $PYTHONPATH or the unrelated-to-the-current-dir default, i.e. in an installed place. If I wanted foo from the current directory I would explicitly add it to $PYTHONPATH.
In fact, I've got a shell alias named dev for precisely this kind of effect: to run code in the local development environment. So to test-run some dev code my practice is to go:
dev some command here ...
which sets up $PATH, $PYTHONPATH etc. suitably to find the development stuff (the modules here, or what have you). Without the dev prefix command I expect to be uninfluenced by the local dev code, even though I'm standing in there.