Looking for feedback on adding import autocomplete to PyREPL

brettcannon · February 27, 2025, 10:30pm

It depends on what path hooks are installed for those directories in submodule_search_locations (i.e. the meta path finders). If they are standard, then it is very reliable, but if they are custom code then all bets are off.

patrick-kidger · February 27, 2025, 11:12pm

I’d offer a -1 on this idea. I’ve actually gone down this road before: as part of implementing a custom debugger, including an embedded REPL that offered autocomplete.

Originally I had this kind of auto-import autocomplete turned on, and it played merry hell.

- Importing heavy modules like torch is an example some have touched on above.
- Plenty of modules register excepthooks, monkey-patch third-party libraries, etc. on import. Clearly questionable design on their part to modify global variables on import like that, but we can’t control that.
- Exceptions during import are unsafe to silently ignore. It’s pretty common to have one module transitively import another third party module, which fails for whatever reason, and now sys.modules has a broken module silently sitting around in it.
- Filtering to just stdlib doesn’t work because these may be shadowed by user modules. (mylib/random.py)
- I have a folder full of one-off files called tmp1.py, tmp2.py, … and an erroneous from tmp4 import <TAB> – when I really meant to type tmp3.py – runs arbitrary code. That is, even the from X import <TAB> case wasn’t sufficiently safe in my case.
- As my particular use-case was a debugger, checking whether particular modules were even imported at all was often important for inspecting the state of my system.

These are all just particular examples of the obvious ‘imports can run arbitrary code’ problem. I’m writing this to add evidence behind the claim that this problem is a serious one, and at least for me was serious enough that I backed away from this idea.

Beyond the above, note that even from something_already_imported import foo<TAB> is unsafe since that may trigger an importlib.util.LazyLoader, or invoke a module-level __getattr__.

I think the only way to do this that is robust enough is static analysis. Can we use an LSP?

tomasr8 · February 28, 2025, 10:33am

But those hooks would interfere with the normal import mechanism as well, right? So IIUC, it wouldn’t be any less reliable than trying to directly import the module?

tomasr8 · February 28, 2025, 10:45am

Thanks everyone for your input, it gave me a lot to think about! I’d like to summarize the options that have been mentioned in this thread so far.

Let me preface this: assuming I can use importlib to find submodules,
the import question only concerns the case from foo import bar<tab>. All other forms
can be completed purely statically, without importing.

That is, in these cases import is not needed:

import foo<tab>
import foo.bar<tab>

from foo<tab>
from foo.bar<tab>

This case currently imports the module foo:

from foo import bar<tab>

This is because in addition to modules, we’re also looking for any
identifiers exported by foo starting with bar.

Here are the options that were mentioned:

Allow imports
- pros: works for both Python and C modules and finds both modules and module attributes
- cons:
  - security implications
  - some packages might take a long time to import, which hurts UX
Allow imports but ask for a confirmation (e.g. first <tab> shows a message, second <tab> imports)
- cons: might not be enough to mitigate the security risk
Don’t allow imports
- pros:
  - no risk from importing modules
  - faster than importing
- cons:
  - only module names can be completed, not attributes,
    e.g., from pathlib import Pa<tab> would not work
Don’t allow imports but use static analysis (i.e. ast.parse) to find attributes.
- pros: a good middle ground between usability and security
- cons:
  - extension modules not supported (for attribute completions)
  - may be less accurate
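As a rough sketch of what the static-analysis option could look like (not what the PR does), the function below collects names bound at the top level of a module's source using ast.parse. The function name top_level_names and the sample source are hypothetical. It ignores names created dynamically, inside if/try blocks, or via star-imports, which is one way the result "may be less accurate".

```python
import ast


def top_level_names(source: str) -> set[str]:
    """Return names bound by top-level statements, without executing the code."""
    names: set[str] = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star-imports cannot be resolved without more work
                # "import os.path" binds "os"; "import x as y" binds "y"
                names.add((alias.asname or alias.name).split(".")[0])
    return names


SAMPLE = '''
import os.path
from collections import OrderedDict as OD
from x import *
class Path: ...
def f(): ...
X = 1
y: int = 2
'''

print(sorted(top_level_names(SAMPLE)))
```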

MegaIng · February 28, 2025, 11:00am

I want to point out that e.g. os.path will not be findable with such a system. This is probably true even if os is already imported; you should probably check sys.modules manually to find such fake submodules. Not sure how common this issue is in general, but IIRC the stdlib alone has three such situations (os.path, collections.abc, typing.io)
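A small sketch of the sys.modules check suggested here, assuming the parent module has already been imported by the user. The helper name loaded_submodules is hypothetical; it only inspects names already registered in sys.modules and imports nothing itself.

```python
import os   # importing os registers "os.path" in sys.modules
import sys


def loaded_submodules(parent: str) -> list[str]:
    """List direct 'submodule' names registered in sys.modules under parent."""
    prefix = parent + "."
    return sorted(
        name[len(prefix):]
        for name in list(sys.modules)  # copy: sys.modules can change during iteration
        if name.startswith(prefix) and "." not in name[len(prefix):]
    )


print(loaded_submodules("os"))
```

Because os is a plain module rather than a package, it has no search locations to scan, so a lookup like this is one of the few import-free ways to see os.path.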

encukou · February 28, 2025, 11:01am

Perhaps lean into this being a heavier and less secure operation, and bind it to another key? Something like [press F8 to import 'foo']?

The REPL probably shouldn’t assume the prompt and cursor position are intact after an import. After an import, I’d expect to see the prompt repeated on a new line, which would be unexpected for tab.

brettcannon · March 1, 2025, 6:09pm

I wouldn’t classify them as “interfering”. They may be doing something like letting someone import code from a SQLite database, for instance. That’s not interfering nor preventing an import from working, it just means that either actually importing the module or having some API to give you the names are the only reliable, complete ways to do it across all possibilities.

malemburg · March 1, 2025, 10:34pm

I don’t think randomly importing things from the file system as a result of the REPL trying to guess the right module/package is a good idea.

However, you don’t actually have to import modules/packages in order to find them. By punting on the dynamic import features that exist, you can restrict the autocomplete to simply using the importlib finders and perhaps doing some extra file system searches for modules in directories.

This won’t be a perfect solution, but at least it avoids the potential security risks of running arbitrary code as a result of importing modules during autocomplete.

If you further limit this to just the stdlib, you can even use a static list of modules/packages to completely do without file system operations. Searching for the modules would be a matter of searching in a list of stdlib modules.

And you can take this idea further by offering tooling to let users seed a cache of available modules - outside the REPL and using a more secure way of discovering modules (e.g. by looking in dirs in site-packages, pyproject.toml files, etc.).

pf_moore · March 1, 2025, 11:19pm

There’s sys.builtin_module_names and sys.stdlib_module_names which provide this for you.

tomasr8 · March 2, 2025, 12:38pm

We could special-case these for the stdlib and just not support this for other packages.

This is already the case for modules, just not for module attributes

tomasr8 · March 2, 2025, 12:49pm

Thanks everyone again for your input! Many people have expressed concerns about PyREPL doing sneaky imports while you type and I think that’s definitely a valid concern.

On the other hand, it seems that most people would find (some sort of) autocomplete useful. So in order to move this proposal forward, I am going to remove the support for module attribute completions from the PR. This should allow me to rely 100% on importlib finders without needing to use import directly.

Even without attribute completions, I think the autocomplete will be quite capable and once there is consensus on what to do for module attributes we can simply extend the autocomplete functionality without changing the existing behaviour.

I’ll post here once I update the PR so you can try it out

dg-pb · March 2, 2025, 1:08pm

IPython autocompletion has been doing this for a long time and as far as I know most people are fine with it.

However, it is understandable that people would like Python REPL to be less “magic”.

Some ideas:

- Have an env var for REPL option flags (comma-separated or something similar), where auto-import is disabled by default.
- If a module is already imported, then do the lookup; otherwise don’t.
- Require some more involved action for module attribute autocompletion, e.g. TAB x 2.
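The second idea (only look up attributes of modules that are already imported) could be sketched roughly as follows. The function name attribute_candidates is hypothetical and this is not how PyREPL implements anything; it reads the module namespace via vars() so that a module-level __getattr__ is not invoked.

```python
import json  # stands in for a module the user has already imported
import sys


def attribute_candidates(module_name: str, prefix: str) -> list[str]:
    """Complete attributes only for modules already present in sys.modules."""
    module = sys.modules.get(module_name)
    if module is None:
        return []  # never import on the user's behalf
    return sorted(name for name in vars(module) if name.startswith(prefix))


print(attribute_candidates("json", "du"))
```

A module that is not in sys.modules simply yields no completions, so typing a wrong module name cannot run any code.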

avylove · March 2, 2025, 10:02pm

That definitely seems like a good way to move forward.

Thinking of attributes import autocomplete as a future possibility, perhaps we can think about providing this as packaging metadata rather than using live analysis. Essentially, using some sort of analysis to build a package object map at package build time that can be easily examined by the REPL and editors. Ideally there would be a way to customize the result to account for cases that may not get caught. This would get rid of security concerns, reduce processing, and better support corner cases. Of course, it would not work in cases where package metadata has not been generated, but I feel that’s a fair trade-off.

steve.dower · March 3, 2025, 12:44pm

There’s already an option to disable all the magic ($env:PYTHON_BASIC_REPL). I don’t really see a need to be more precise than that.

pf_moore · March 3, 2025, 1:36pm

As far as I can see (I checked the 3.14 Setup and Usage and library docs) there’s still no documentation for the new REPL, so this isn’t particularly discoverable yet.

I agree that it’s not worth having every feature have its own env variable, but equally, I can’t fault people for asking for more control when it’s not at all clear what they have right now…

Lucas_Malor · March 3, 2025, 7:40pm

I like this option.

tomasr8 · March 10, 2025, 3:30pm

PR has been updated.

Modules are no longer imported. Modules are discovered using a combination of pkgutil.iter_modules and submodule_search_locations.
Removed support for module attributes. This relied on actually importing the modules. We can add this once we have decided in which way we should discover module attributes.
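A rough sketch of the discovery approach described above, not the code from the PR. The helper names top_level_modules and submodules are hypothetical. The sketch deliberately refuses dotted package names, because importlib.util.find_spec imports parent packages when resolving a dotted name.

```python
import importlib.util
import pkgutil


def top_level_modules(prefix: str = "") -> list[str]:
    """List importable top-level module names on sys.path without importing them."""
    return sorted({m.name for m in pkgutil.iter_modules() if m.name.startswith(prefix)})


def submodules(package: str) -> list[str]:
    """List direct submodules of a top-level package via its search locations."""
    if "." in package:
        return []  # find_spec would import the parent package; out of scope here
    spec = importlib.util.find_spec(package)
    if spec is None or spec.submodule_search_locations is None:
        return []  # not found, or a plain module rather than a package
    return sorted(m.name for m in pkgutil.iter_modules(spec.submodule_search_locations))


print(top_level_modules("js"))
print(submodules("json"))
```

As noted earlier in the thread, how complete this is depends on the finders involved: custom finders may not support this kind of listing.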

Feedback welcome!

tomasr8 · May 10, 2025, 4:47pm

A slightly delayed update, but this is now in main and the 3.14 beta so feel free to try it out and let me know if you have any feedback!

tomasr8 · May 23, 2025, 12:00am

PyConUS sprints update: a couple of new contributors helped improve the autocomplete even further:

PyREPL: autocomplete built-in modules · Issue #134235 · python/cpython · GitHub
PyREPL: Do not show underscored modules by default during autocompletion · Issue #134215 · python/cpython · GitHub

patrick-kidger · May 26, 2025, 9:09pm

By the way, is there a way to access autocomplete from user code? Essentially looking for a str -> Iterable[str] function.

I’m currently using Jedi for this purpose but it has a few sharp edge cases that need working around. If import-free autocomplete is going to be stdlib then I’m hoping that will be a safer choice.