Looking for feedback on adding import autocomplete to PyREPL

tomasr8 · February 26, 2025, 7:48pm

Recently, I opened a PR adding support for import autocomplete to PyREPL. PyREPL already has autocomplete for attributes but this would allow you to type e.g. import coll<tab> which would complete to import collections.

The autocomplete also works when you use from .. import, e.g., from pathlib import Pa<tab> which would turn into from pathlib import Path.

In the last example, in order for the REPL to know what objects are available, it first imports the module. Tian raised a question, whether importing modules in this way is a good idea as some users might find it surprising and/or unwanted that the REPL sneakily imports a module in order to provide completions.

I wanted to ask for more feedback on this feature, namely, is it ok to implicitly import modules in order to provide completions? The alternative is to never import anything, but then (at least to my understanding) we wouldn’t be able to provide completions in all cases, for instance, when typing from pathlib import Pa<tab>.
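To make the trade-off concrete, here is a minimal sketch of the import-based approach. It is not the code from the PR; `complete_from_import` is a hypothetical helper name, and the point is only that listing a module's attributes with `dir()` requires importing it first.

```python
import importlib


def complete_from_import(module_name: str, prefix: str) -> list[str]:
    """Return attribute names of `module_name` that start with `prefix`.

    Hypothetical helper: this imports the module, so any import-time
    side effects (and any import-time delay) happen on Tab.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # A failing import should not break the prompt; offer nothing.
        return []
    return sorted(name for name in dir(module) if name.startswith(prefix))


# Candidates for: from pathlib import Pa<tab>
print(complete_from_import("pathlib", "Pa"))
```

After this call, `pathlib` is in `sys.modules` even if the user never presses Enter, which is exactly the behaviour in question.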

I’d like to know your thoughts on this.

guido · February 26, 2025, 8:03pm

Seems to make sense, since when the user hits Enter they are actually going to import it. Unless they erase the line and start typing something completely different.

What does help(‘some_module’) do? The same thing would be acceptable IMO.

Rosuav · February 26, 2025, 8:06pm

It imports it. I just tried help("this") and it dumped the Zen onto the console.

I’d agree that this is a nice convenience. +1.

MegaIng · February 26, 2025, 8:40pm

The one thing I would worry about is that imports can sometimes take a long time, making autocomplete weirdly unresponsive - this shouldn’t be an issue for any stdlib modules, but can be a bad user experience for a few third party libraries. I am not sure if this is a problem to worry about, and I am not sure if there is a sensible solution.

In case someone wants to respond with “just don’t press tab if it’s a slow-to-import module” - for me at least, and probably for many others, always pressing tab is a habit and not a conscious decision, which makes it very annoying if this freezes the console for more than a fraction of a second.

(Note that I am assuming the import happens sync and in the main thread resulting in a freeze of the input ability - importing outside the main thread is probably a bad idea, and I don’t think an async import can happen without another thread)

Otherwise I agree that this is a good idea.

zware · February 26, 2025, 8:43pm

Could the implicit import be made explicit by initially offering something along the lines of ‘module not imported, tab again to import’?
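One way this could look, as a rough sketch only: `ConfirmingCompleter` and its message text are hypothetical and not part of PyREPL. The first Tab on a module that is not yet in `sys.modules` returns a notice, and only a second Tab on the same module triggers the import.

```python
import importlib
import sys


class ConfirmingCompleter:
    """Hypothetical two-step completer: warn first, import on the second Tab."""

    def __init__(self):
        self._pending = None  # module name waiting for a confirming Tab

    def complete(self, module_name, prefix):
        """Return (candidates, message); message is None when completing."""
        if module_name not in sys.modules and self._pending != module_name:
            self._pending = module_name
            return [], f"{module_name} not imported, tab again to import"
        self._pending = None
        try:
            module = importlib.import_module(module_name)
        except Exception:
            return [], f"could not import {module_name}"
        names = sorted(n for n in dir(module) if n.startswith(prefix))
        return names, None


completer = ConfirmingCompleter()
print(completer.complete("colorsys", "rgb"))  # first Tab
print(completer.complete("colorsys", "rgb"))  # second Tab
```

Modules that are already imported complete immediately, so the extra keypress only applies where an import would actually be triggered.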

tomasr8 · February 26, 2025, 8:47pm

Tian mentioned torch as an example which can apparently take a few seconds to import. Though I’m not sure what to do about that.

tomasr8 · February 26, 2025, 8:50pm

I’d have to check how hard this would be to implement in PyREPL, but that could definitely be an option.

oscarbenjamin · February 26, 2025, 9:06pm

There is no distinction in Python between a “module” and a “script”. Executing an arbitrary .py file from the current directory can do much worse than take a few seconds.

hroncok · February 26, 2025, 9:13pm

For the record, ipython imports such a module.

In [1]: import sys

In [2]: sys.modules['matplotlib']
---------------------------------------------------------------------------
KeyError...

In [3]: from matplotlib import TAB TAB Ctrl+C

In [3]: sys.modules['matplotlib']
Out[3]: <module 'matplotlib' from '/usr/lib64/python3.13/site-packages/matplotlib/__init__.py'>

guido · February 26, 2025, 9:26pm

Would it make sense to just say that whatever help() does should be okay for tab completion?

Or is there a significant difference between help() and tab completion?

Or is maybe what help() does also not great (the “what if it’s a script” issue applies there too) and this is a good time to reconsider what’s safe?

tomasr8 · February 26, 2025, 9:40pm

For the record, ipython imports such a module.

Indeed, here’s the source if anyone’s curious: ipython/IPython/core/completerlib.py at 926d3851fef9b2e66024d9a6e623577b0b4d2a28 · ipython/ipython · GitHub

pitrou · February 26, 2025, 9:42pm

I don’t think it is.

How does that work? Do you have a hardcoded list of stdlib modules to choose from? If I have Pandas installed and start to type import pa<tab>, would it try importing Pandas under the hood?

MegaIng · February 26, 2025, 9:51pm

I tried to address this: There is a difference between the explicit operation of typing help('some_module') and the more automatic operation of pressing tab after writing from some_module import A. The “what if it’s a script” issue applies to both (i.e. unintended side effects), but the issue of unresponsiveness does not.

tomasr8 · February 26, 2025, 9:52pm

How does that work? Do you have a hardcoded list of stdlib modules to choose from? If I have Pandas installed and start to type import pa<tab>, would it try importing Pandas under the hood?

I’m using pkgutil.iter_modules which works (I think, I need to double check with submodules) for import pa<tab> and also import pandas.foo<tab>.

It will currently try to import pandas if you type from pandas import <tab> because it’s looking for both submodules and attributes and I don’t think I can get the list of attributes without importing the module.
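For the module-name part, a minimal sketch of how pkgutil.iter_modules can list candidates without importing them. `complete_module_name` is a hypothetical name, and adding `sys.builtin_module_names` is my own assumption, since built-in modules such as sys are not found by scanning sys.path.

```python
import pkgutil
import sys


def complete_module_name(prefix: str) -> list[str]:
    """Return top-level module names starting with `prefix`.

    Hypothetical helper: only scans what the path finders can see on
    sys.path; none of the candidate modules are imported.
    """
    names = {
        info.name
        for info in pkgutil.iter_modules()
        if info.name.startswith(prefix)
    }
    # Assumption: also offer built-in modules, which have no file on sys.path.
    names.update(n for n in sys.builtin_module_names if n.startswith(prefix))
    return sorted(names)


# Candidates for: import coll<tab>
print(complete_module_name("coll"))
```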

oscarbenjamin · February 26, 2025, 10:02pm

It goes further though because with tab-complete you may not have typed the name of the module:

from s TAB
from screw_my_system TAB
from screw_my_system import TAB TAB
# system is now screwed.

Here all you typed was s and the tab completer guessed the rest.

MegaIng · February 26, 2025, 10:10pm

Oscar Benjamin:

It goes further though because with tab-complete you may not have typed the name of the module:
from s TAB
from screw_my_system TAB
from screw_my_system import TAB TAB
# system is now screwed.
Here all you typed was s and the tab completer guessed the rest.

I guess I am not in the habit of having screw_my_system modules/scripts lying around and randomly pressing tab till something happens, so I can’t really see this as a realistic issue.

This is the same as complaining about autocomplete on the phone because you just pressed the “next suggested word” button and sent the message and are then confused why the message doesn’t make sense. (Notice how you had to press tab three times after screw_my_system got written out).

brettcannon · February 27, 2025, 12:47am

How fancy do you want to be with this, and how much do you want to worry about extension modules? The “fancy” part is I have been wanting to get rid of pkgutil for ages since it doesn’t integrate into the import system very well, but the various iteration functions don’t have an equivalent in the import system. If we could define a way to iterate through loaders for a location then it could be used both to replace pkgutil and find all importable names.

The “worry about extension modules” part is once you have a finder you can get a loader. Once you have a loader you can get Python source, and that gets you an AST which you could walk to gather names (or read __all__ with some effort). But that might be too limiting.
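A rough sketch of that finder, loader, source, AST chain, under a few assumptions: `static_names` is a hypothetical helper, it only looks at top-level statements, and it returns nothing when the loader has no source (extension, built-in, and frozen modules).

```python
import ast
import importlib.util


def static_names(module_name: str) -> list[str]:
    """Collect top-level names from a module's source without executing it."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return []
    if spec is None or spec.loader is None:
        return []
    get_source = getattr(spec.loader, "get_source", None)
    if get_source is None:
        return []
    try:
        source = get_source(module_name)
    except (ImportError, OSError):
        return []
    if source is None:
        return []  # no Python source available, e.g. an extension module
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []
    names = set()
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name != "*":
                    names.add((alias.asname or alias.name).split(".")[0])
    return sorted(names)


print([n for n in static_names("textwrap") if n.startswith("de")])
```

Names defined inside `if`/`try` blocks, star imports, and anything created dynamically are missed here, which is the "too limiting" part. Note also that find_spec imports parent packages when given a dotted name, so this is only import-free for top-level modules.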

csm10495 · February 27, 2025, 4:53am

The conversation about whether to import or not sort of reminds me of the sandbox problem. If there was a sandbox to import and just get names and somehow not do the rest it could be helpful here.

Alas that isn’t a thing and isn’t exactly possible.

I kind of figured that we consider the files in site-packages and in PYTHONPATH to be sort of safe. Though it isn’t if someone has some sketchy file with side-effects.

Maybe we just enable auto import for stdlib modules? Honestly I would allow importing other ones too, but that could be a switch defaulted to off that power users could set to true.

Beyond that, adding a step where you press tab twice to import doesn’t sound like it would deter the type of person who would import/run willy-nilly anyway.

gpshead · February 27, 2025, 8:30am

Type checkers, linters, and language servers already do this in a wide variety of different ways (ex: hacky fast regex extractions, AST/CST based extractions, or even symbolic execution). These are the ways traditional IDE autocomplete is successful without an LLM. The code is never executed. Meaningful enough information is inferred from it. It isn’t perfect - nor does it need to be.

The easy line to draw for me is not to tab complete the module names, particularly the top level ones.

But once a “from x import Som” statement has been typed OR if x was already in sys.modules via code executed earlier, importing and introspecting it for autocomplete from the repl - as apparently ipython does - feels okay enough. The user was likely to hit enter and execute that import whose full name they typed already, they now just wanted to avoid typing SomeLongName.

Tab completing module names via hooks into import loading machinery to find but not load the code is semi-doable (as Brett seems to be talking about) but feels like it could get complex.

How much complexity do we want to live with?

Autocompleting that executes a pile of unbounded code has rough UX edges (as my shell can attest to in many completing scenarios as I live edit files that are part of completions and thus enter hellish loops when I hit tab while those are in a broken state)… It suddenly adds an unknowable delay to something the user expects to be fast. It also gets awkward when the code generates output or errors. These add friction to users who aren’t yet ready for the “it’s turtles all the way down” system design world of completions executing a whole dependency chain of code loaded from potentially slow/remote filesystems. People still use autocompletes up the wazoo regardless because they are so useful.

tomasr8 · February 27, 2025, 8:59am

That would be great. I’ve done some experimenting yesterday and it seems I can ‘manually’ get all the package submodules by using a combination of find_spec() and submodule_search_locations. Do you know how reliable this is? Having a function for this in importlib would be great.
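Roughly what that experiment looks like, as a sketch rather than the PR code: `list_submodules` is a hypothetical name, and feeding submodule_search_locations back into pkgutil.iter_modules is just one way to enumerate the directories.

```python
import importlib.util
import pkgutil


def list_submodules(package_name: str) -> list[str]:
    """List direct submodules of a package without importing the package.

    Caveat: for a dotted name such as "a.b", find_spec imports the parent
    package "a", so this is only import-free for top-level packages.
    """
    try:
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        return []
    if spec is None or spec.submodule_search_locations is None:
        return []  # not found, or a plain module rather than a package
    return sorted(
        info.name for info in pkgutil.iter_modules(spec.submodule_search_locations)
    )


# Candidates for: import json.<tab>
print(list_submodules("json"))
```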

I think ideally the autocomplete should work regardless of whether we have an extension module or a pure Python package. Though, I think that would necessitate importing, at least in the case of looking for attributes.

I already do that (via pkgutil.iter_modules, though only for top-level imports, but it can probably be extended, I’ll try to update the PR today). This means we would ‘only’ need to import when completing module attributes e.g. from x import Som. Typing from x<tab> would not do an import.