Better PyREPL history file format?

The new REPL uses the same old plain-text format for storing history entries. It has basically one advantage: it’s simple. And one disadvantage: it’s simple! It can’t store anything but the commands themselves, so there is no room for meta-information (e.g. timestamps).

One example of its current limitations is support for multiple REPLs running concurrently. Once all of them have exited, you end up with the history from the last REPL you quit. Everything else is lost.

This could be mitigated by setting a separate PYTHON_HISTORY environment variable for each running interpreter, each pointing to its own file. Such history files could then be merged into a single file, though I can’t imagine anything better than plain concatenation of all commands (well, perhaps you could find a common preamble shared by all saved histories). I doubt anyone has done something like this.

The IPython REPL is much better here. Here is an example (REPLs 1 and 2 run concurrently, with REPL 1 opened first):

$ ipython
In [1]: # repl 1
In [2]: 111
Out[2]: 111
In [3]: 333
Out[3]: 333
In [4]: %history
# repl 1
111
333
%history
In [5]: # quit repl 1
In [6]:                                            
$ ipython
In [1]: # repl 2
In [2]: 222
Out[2]: 222
In [3]: 444
Out[3]: 444
In [4]: %history
# repl 2
222
444
%history
In [5]: # quit repl 2
In [6]:                                           
$ ipython
In [1]: # repl 3
In [2]: %history -l 11
# repl 1
111
333
%history
# quit repl 1
# repl 2
222
444
%history
# quit repl 2
# repl 3

Another issue with the current REPL is that its history is lost if the interpreter crashes, because the history file is updated only at the end of the session. This never happens with IPython: statements are saved immediately before execution and are available to every REPL you run later.
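To illustrate the "save before execution" idea with the existing plain-text format, here is a minimal sketch. The helper names `append_entry` and `read_entries` are hypothetical, not part of PyREPL or IPython, and the sketch assumes single-line statements (multi-line input would need some escaping scheme).

```python
import os


def append_entry(path, source):
    """Append one statement to a plain-text history file right away."""
    # Append mode never truncates, so entries written by other
    # interpreters sharing the same file are kept.
    with open(path, "a", encoding="utf-8") as f:
        f.write(source + "\n")
        f.flush()
        os.fsync(f.fileno())  # push the line to disk before executing it


def read_entries(path):
    """Return all stored statements, oldest first."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()
```

Because each statement is on disk before it runs, a crash loses at most the statement that was being typed, and several interpreters appending to the same file do not overwrite each other’s entries.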

We could extend the old plain-text format, e.g. the way bash does (it puts timestamps in comments). With some care, this would allow history merging to be automated.
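A rough sketch of what such merging could look like, assuming a bash-like layout where a line of the form `#<unix-seconds>` precedes each command. The function names are hypothetical. The "care" part shows up immediately: a real Python comment such as `#123` would be mistaken for a timestamp, so an actual format would need a less ambiguous marker.

```python
def parse_history(text):
    """Parse timestamped history text into (timestamp, command) pairs.

    Assumed format: a '#<unix-seconds>' line applies to the command
    lines that follow it, until the next timestamp line.
    """
    entries = []
    stamp = 0  # commands before the first timestamp sort first
    for line in text.splitlines():
        if line.startswith("#") and line[1:].isdecimal():
            stamp = int(line[1:])
        elif line:
            entries.append((stamp, line))
    return entries


def merge_histories(*texts):
    """Merge several history texts into one list ordered by timestamp."""
    entries = []
    for text in texts:
        entries.extend(parse_history(text))
    # sorted() is stable, so commands sharing a timestamp keep
    # their original relative order.
    return [cmd for _, cmd in sorted(entries, key=lambda e: e[0])]


repl1 = "#100\n111\n#300\n333\n"
repl2 = "#200\n222\n#400\n444\n"
merged = merge_histories(repl1, repl2)
```

With the two sample inputs, the commands come out interleaved in the order they were entered rather than simply concatenated.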

Though, why not adopt the IPython approach (maybe even reuse its code for history support)? The stdlib has the sqlite3 module (plain-text format support could also be kept for compatibility and/or as a fallback), and we could provide a CLI for working with the history file using plain-text tools. (I wonder how common something like grep <foo> ~/.python_history is for people?)
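For a sense of scale, here is a minimal sketch of a SQLite-backed history using only the stdlib. The schema, the `session` numbering and the function names are assumptions made for illustration; this is not IPython’s actual schema or code.

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS history (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    session INTEGER NOT NULL,
    ts      REAL    NOT NULL,
    source  TEXT    NOT NULL
)
"""


def open_history(path=":memory:"):
    """Open (and create if needed) a history database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def record(conn, session, source):
    """Store one statement immediately, before it is executed."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO history (session, ts, source) VALUES (?, ?, ?)",
            (session, time.time(), source),
        )


def dump(conn, last=None):
    """Return stored statements as plain text lines, grouped by session."""
    rows = conn.execute(
        "SELECT source FROM history ORDER BY session, id"
    ).fetchall()
    lines = [row[0] for row in rows]
    return lines[-last:] if last else lines
```

Something like `dump()` is also what a plain-text export for grep-style tools could be built on: two sessions recording interleaved statements come back grouped per session, similar to the transcript above.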

Is that a sane feature proposal or too complex for the default CPython shell?


I’m cautiously supportive of these ideas. My main question is how many user tickets or forum questions there have been around these issues? It would seem like a shame to implement significant additional complexity for relatively little benefit.

In particular, I’m not sure how important being more graceful on crashes is, as crashes of the interpreter are relatively rare. If crashes are considered unimportant to handle better, then I’m not sure a change in file format is justified.

Here’s a scenario which I think is worth considering:
If I accidentally enter sensitive information in a REPL, how do I remediate it?
e.g.

>>> # a long and productive REPL session
...
>>> # accidental copy paste
>>> x = "the quick brown medical record number: 987654"
>>> quit
# now what?

With the current flat history, you can edit or delete it. If the format changes to SQLite, editing becomes much harder. For many users, deleting the whole file would be their only option in such a case.


Good question. I suspect not too many, but the reason is that the plain CPython shell is less popular for day-to-day work when there are enhanced, well-developed and stable alternatives like IPython. Perhaps the new REPL might change this someday.

BTW, I’m not sure that the required changes are significant (cf. support for syntax highlighting).

That depends on your workflow, but in general, yes, I think you are right. That’s why the mentioned issue was a secondary example of the limitations in PyREPL’s history handling.

Edit: and it’s possible to solve this with a plain-text history file format.

This one seems easy: you can remove entries from the SQL tables just like you can remove a line from the plain-text file. The readline module has interfaces to remove/replace history items, and we can provide the same API in the new REPL. I’m not sure how popular this scenario is; the IPython %history magic has no simple option to do that (though it’s possible).
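As a sketch of what removal could look like on the SQL side, assuming a hypothetical `history` table with a `source` column (the in-memory analogue in readline is `readline.remove_history_item(pos)`):

```python
import sqlite3


def forget(conn, needle):
    """Delete every stored statement containing `needle`; return the count."""
    with conn:  # commit the deletion
        cur = conn.execute(
            # instr() does a plain substring match, so no LIKE escaping is needed
            "DELETE FROM history WHERE instr(source, ?) > 0",
            (needle,),
        )
    return cur.rowcount


# Tiny demo with a hypothetical schema and made-up entries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, source TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO history (source) VALUES (?)",
    [("import math",), ('x = "secret 987654"',), ("math.pi",)],
)
removed = forget(conn, "987654")
remaining = [row[0] for row in conn.execute("SELECT source FROM history ORDER BY id")]
```

One caveat for the sensitive-data case: depending on how SQLite is built and configured, deleted content may linger in the database file until it is overwritten or the database is vacuumed, so a real implementation would probably want to handle that as well.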


It’s considerably more complicated, if only because editing plain-text is dead easy.

That said, “nuke your whole history file” is also an acceptable solution to that problem, IMO.

Saving the history list at exit is a good thing IMO, because it makes it possible to work with the history list in memory. It’s a pity that the REPL currently doesn’t do that. It just appends the last line, even if it is a duplicate. I would prefer that only the last occurrence be saved. A way to limit the history size would be appreciated too. I find anything above, say, 25 distinct lines not really useful. However, none of these nice-to-have features depends on the history file format.
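Both wishes can be expressed on the in-memory list in a few lines. A minimal sketch, with `compact` being a hypothetical helper name:

```python
def compact(lines, limit=None):
    """Keep only the last occurrence of each line, then the newest `limit`."""
    # dict.fromkeys() keeps the first occurrence of each key, so walk the
    # list backwards to keep the last one, then restore the original order.
    unique = list(dict.fromkeys(reversed(lines)))
    unique.reverse()
    if limit is not None:
        # Guard against limit == 0, because unique[-0:] is the whole list.
        unique = unique[-limit:] if limit > 0 else []
    return unique


history = ["a", "b", "a", "c", "b"]
deduplicated = compact(history)
newest_two = compact(history, limit=2)
```

For the size limit alone, the readline module already offers `readline.set_history_length()`, which `write_history_file()` uses to truncate the file.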

Maybe related. I’ve long had code (based on the example in the docs) in my $PYTHONSTARTUP file to put the history in a different location.

I was debugging a different problem with the CPython test suite and it led me to a bug in my startup file. After fixing that, I also noticed that Python wasn’t actually saving the history correctly, at least AFAICT, because I couldn’t just scroll up and back after restarting the interactive interpreter. I ended up just deciding that it didn’t matter where my history file is saved, so I just deleted the code and moved on, but I wonder if it was related to the new repl.

Point being, whatever we do with the new repl’s history, let’s make sure we keep in mind that folks do want to store their repl history elsewhere, and perhaps make it easier by adding something like a $PYTHONREPLHISTORYFILE environment variable, rather than requiring users to write some obscure readline code.

This is already the case, Barry.

Awesome! It probably makes sense then to update the readline example to something more useful, or at least add a note that this is no longer necessary to get the stated functionality, and is just an example.
