It depends on what path hooks are installed for those directories in submodule_search_locations (i.e. the meta path finders). If they are standard, then it is very reliable, but if they are custom code then all bets are off.
I'd offer a -1 on this idea. I've actually gone down this road before: as part of implementing a custom debugger, including an embedded REPL that offered autocomplete.
Originally I had this kind of auto-import autocomplete turned on, and it played merry hell.
- Importing heavy modules like torch is an example some have touched on above.
- Plenty of modules register excepthooks, monkey-patch third-party libraries, etc. on import. Clearly questionable design on their part to modify global variables on import like that, but we can't control that.
- Exceptions during import are unsafe to silently ignore. It's pretty common to have one module transitively import another third party module, which fails for whatever reason, and now sys.modules has a broken module silently sitting around in it.
- Filtering to just stdlib doesn't work because these may be shadowed by user modules. (mylib/random.py)
- I have a folder full of one-off files called tmp1.py, tmp2.py, … and an erroneous from tmp4 import <TAB> (when I really meant to type tmp3.py) runs arbitrary code. That is, even the from X import <TAB> case wasn't sufficiently safe in my case.
- As my particular use-case was a debugger, checking whether particular modules were even imported at all was often important for inspecting the state of my system.
These are all just particular examples of the obvious "imports can run arbitrary code" problem. I'm writing this to add evidence behind the claim that this problem is a serious one, and at least for me was serious enough that I backed away from this idea.
Beyond the above, note that even from something_already_imported import foo<TAB> is unsafe since that may trigger an importlib.util.LazyLoader, or invoke a module-level __getattr__.
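To make the module-level __getattr__ point concrete, here is a minimal sketch. The module name already_imported_demo and the helper _module_getattr are made up for illustration; the only real mechanism used is that a module whose namespace contains __getattr__ gets it called for missing attributes, including from a from-import.

```python
import sys
import types

calls = []  # records every attribute lookup that reaches the hook


def _module_getattr(name):
    # Hypothetical hook: stands in for arbitrary code a real module could run.
    calls.append(name)
    if name.startswith("__"):
        # Let the import machinery's own probes (e.g. for __path__) fail normally.
        raise AttributeError(name)
    return 42


# Build a module object by hand and register it as if it were already imported.
mod = types.ModuleType("already_imported_demo")
mod.__getattr__ = _module_getattr
sys.modules[mod.__name__] = mod

# No file is loaded here, yet the hook still runs to resolve "foo".
from already_imported_demo import foo

print(foo, "foo" in calls)
```

So even when nothing new is imported, resolving a name on an existing module can execute user code.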
I think the only way to do this that is robust enough is static analysis. Can we use an LSP?
But those hooks would interfere with the normal import mechanism as well, right? So IIUC, it wouldn't be any less reliable than trying to directly import the module?
Thanks everyone for your input, it gave me a lot to think about! I'd like to summarize the options that have been mentioned in this thread so far.
Let me just preface that, assuming I'll be able to use importlib to find submodules, the import question is only about the case from foo import bar<tab>. All other forms can be done purely statically without importing.
That is, in these cases import is not needed:
import foo<tab>
import foo.bar<tab>
from foo<tab>
from foo.bar<tab>
This case currently imports the module foo:
from foo import bar<tab>
This is because in addition to modules, we're also looking for any identifiers exported by foo starting with bar.
Here are the options that were mentioned:
- Allow imports
  - pros: works for both Python and C modules and finds both modules and module attributes
  - cons:
    - security implications
    - some packages might take long to import which hurts UX
- Allow imports but ask for a confirmation (e.g. first <tab> shows a message, second <tab> imports)
  - cons: might not be enough to mitigate the security risk
- Don't allow imports
  - pros:
    - no risk from importing modules
    - faster than importing
  - cons:
    - only module names can be completed, not attributes, e.g., from pathlib import Pa<tab> would not work
- Don't allow imports but use static analysis (i.e. ast.parse) to find attributes.
  - pros: a good middle ground between usability and security
  - cons:
    - extension modules not supported (for attribute completions)
    - may be less accurate
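For the static-analysis option, a rough sketch of what "use ast.parse to find attributes" could look like is below. The function name top_level_names is hypothetical and this is not taken from the PR; it only looks at top-level statements, which is exactly where the "may be less accurate" caveat comes from.

```python
import ast


def top_level_names(source):
    """Collect names bound by top-level statements, without executing anything."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be resolved statically here
                    names.add(alias.asname or alias.name.split(".")[0])
        else:
            if isinstance(node, ast.Assign):
                targets = node.targets
            elif isinstance(node, ast.AnnAssign) and node.value is not None:
                targets = [node.target]
            else:
                continue  # if/try/for blocks etc. are ignored in this sketch
            for target in targets:
                # ast.walk also covers tuple targets such as "a, b = 1, 2";
                # the Store check skips the "obj" in "obj.attr = 1".
                names.update(
                    n.id
                    for n in ast.walk(target)
                    if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
                )
    return names


example = (
    "import os.path\n"
    "from a import b as c\n"
    "X = 1\n"
    "class Path: pass\n"
    "if True:\n"
    "    hidden = 1\n"
)
print(sorted(n for n in top_level_names(example) if n.startswith("Pa")))
```

Names bound inside conditionals (hidden above), star imports, and anything in an extension module are missed, so this trades completeness for never running the module.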
I want to point out that e.g. os.path will not be findable with such a system. This is probably true even if os is already imported; you should probably check sys.modules manually to find such fake submodules. Not sure how common this issue is in general, but IIRC the stdlib alone has three such situations (os.path, collections.abc, typing.io)
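A small sketch of the sys.modules check suggested here, assuming a hypothetical helper called loaded_submodules. It only reports names that something has already loaded, so it never imports anything itself.

```python
import importlib.util
import os  # importing os registers "os.path" in sys.modules
import sys


def loaded_submodules(package):
    """List direct child names of `package` that are already in sys.modules."""
    prefix = package + "."
    return sorted(
        name[len(prefix):]
        for name in list(sys.modules)  # copy: sys.modules can change while iterating
        if name.startswith(prefix) and "." not in name[len(prefix):]
    )


# os is a plain module, so there are no submodule search locations to scan...
print(importlib.util.find_spec("os").submodule_search_locations)
# ...but os.path still shows up among the already-loaded names.
print("path" in loaded_submodules("os"))
```

This returns real submodules as well as the fake ones, so it would need to be merged with whatever the finder-based search returns.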
Perhaps lean into this being a heavier and less secure operation, and bind it to another key? Something like [press F8 to import 'foo']?
The REPL probably shouldn't assume the prompt and cursor position are intact after an import. After an import, I'd expect to see the prompt repeated on a new line, which would be unexpected for tab.
I wouldn't classify them as "interfering". They may be doing something like letting someone import code from a SQLite database, for instance. That's not interfering nor preventing an import from working, it just means that either actually importing the module or having some API to give you the names are the only reliable, complete ways to do it across all possibilities.
I don't think randomly importing things from the file system as a result of the REPL trying to guess the right module/package is a good idea.
However, you don't actually have to import modules/packages in order to find them. By punting on the dynamic import features that exist, you can restrict the autocomplete to simply using the importlib finders and perhaps doing some extra file system searches for modules in directories.
This won't be a perfect solution, but at least it avoids the potential security risks of running arbitrary code as a result of importing modules during autocomplete.
If you further limit this to just the stdlib, you can even use a static list of modules/packages to completely do without file system operations. Searching for the modules would be a matter of searching in a list of stdlib modules.
And you can take this idea further by offering tooling to let users seed a cache of available modules - outside the REPL and using a more secure way of discovering modules (e.g. by looking in dirs in site-packages, pyproject.toml files, etc.).
There's sys.builtin_module_names and sys.stdlib_module_names which provide this for you.
We could special-case these for the stdlib and just not support this for other packages.
This is already the case for modules, just not for module attributes
Thanks everyone again for your input! Many people have expressed concerns about PyREPL doing sneaky imports while you type and I think that's definitely a valid concern.
On the other hand, it seems that most people would find (some sort of) autocomplete useful. So in order to move this proposal forward, I am going to remove the support for module attribute completions from the PR. This should allow me to rely 100% on importlib finders without needing to use import directly.
Even without attribute completions, I think the autocomplete will be quite capable and once there is consensus on what to do for module attributes we can simply extend the autocomplete functionality without changing the existing behaviour.
I'll post here once I update the PR so you can try it out
IPython autocompletion has been doing this for a long time and as far as I know most people are fine with it.
However, it is understandable that people would like the Python REPL to be less "magic".
Some ideas:
- Have an env var for REPL option flags (comma separated or something similar), where auto-import is disabled by default.
- If a module is already imported, then it does the lookup; otherwise it doesn't.
- Some more involved action is needed for module attribute autocompletion. E.g. TAB x 2
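The "only if already imported" idea could be sketched roughly as below. The name complete_loaded_attrs is hypothetical, and reading vars(module) instead of calling dir() or getattr() is my own assumption about how to avoid triggering a module-level __getattr__ or __dir__ hook; the trade-off is that lazily provided attributes are not offered.

```python
import json  # any module that is already imported works for the demo
import sys


def complete_loaded_attrs(module_name, prefix):
    """Complete attribute names only for modules that are already in sys.modules."""
    module = sys.modules.get(module_name)
    if module is None:
        return []  # not imported yet: offer nothing rather than import it
    # vars() reads the module's namespace dict directly, so no hooks run.
    return sorted(name for name in vars(module) if name.startswith(prefix))


print(complete_loaded_attrs("json", "du"))
```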
That definitely seems like a good way to move forward.
Thinking of attribute import autocomplete as a future possibility, perhaps we can think about providing this as packaging metadata rather than using live analysis. Essentially, using some sort of analysis to build a package object map at package build time that can be easily examined by the REPL and editors. Ideally there would be a way to customize the result to account for cases that may not get caught. This would get rid of security concerns, reduce processing, and better support corner cases. Of course, it would not work in cases where package metadata has not been generated, but I feel that's a fair trade-off.
There's already an option to disable all the magic ($env:PYTHON_BASIC_REPL). I don't really see a need to be more precise than that.
As far as I can see (I checked the 3.14 Setup and Usage and library docs) there's still no documentation for the new REPL, so this isn't particularly discoverable yet.
I agree that it's not worth having every feature have its own env variable, but equally, I can't fault people for asking for more control when it's not at all clear what they have right now…
I like this option.
PR has been updated.
- Modules are no longer imported. Modules are discovered using a combination of pkgutil.iter_modules and submodule_search_locations.
- Removed support for module attributes. This relied on actually importing the modules. We can add this once we have decided in which way we should discover module attributes.
Feedback welcome!
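For anyone who wants to experiment with the approach, here is a rough sketch of import-free discovery built on pkgutil.iter_modules and submodule_search_locations. It is not the PR's code: search_locations and complete_module are made-up names, and it assumes the standard path-based finders (custom meta path finders, and namespace packages nested inside packages that have not been imported, are not handled).

```python
import pkgutil
import sys
from importlib.machinery import PathFinder


def search_locations(package):
    """Return the directories of a dotted package name without importing it."""
    path = None  # None means "search sys.path" for the top-level part
    fullname = ""
    for part in package.split("."):
        fullname = f"{fullname}.{part}" if fullname else part
        spec = PathFinder.find_spec(fullname, path)
        if spec is None or spec.submodule_search_locations is None:
            return []  # not found, or a plain module rather than a package
        path = list(spec.submodule_search_locations)
    return path


def complete_module(prefix):
    """Complete a (possibly dotted) module name such as 'json.dec'."""
    package, _, stem = prefix.rpartition(".")
    if package:
        names = [m.name for m in pkgutil.iter_modules(search_locations(package))]
    else:
        names = list(sys.builtin_module_names)
        names += [m.name for m in pkgutil.iter_modules()]  # scans sys.path
    return sorted({name for name in names if name.startswith(stem)})


print(complete_module("json.dec"))
```

Because only finders and directory listings are consulted, a package whose __init__.py raises or has side effects can still be completed without that code ever running.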