Displaying parsing errors

In the ideal world, how would we display parsing errors?

This is a follow up on: Truncating SyntaxError - #6 by storchaka and I’m trying to figure out what a protocol needs to support in order to display the error messages the way we want.

Some notes:

  1. Displaying bytes in hex is not helpful and is more likely to confuse the user

    expand
      File "file.txt", position 3
        63 61 66 e9
                 ^^
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9: ordinal not in range(128)
    
  2. A global offset doesn’t work very well, because newlines can’t be displayed properly in a terminal

    expand

    The current error message even mentions the line and column!

    re.PatternError: unterminated character set at position 4 (line 2, column 1)
    

    Not replaced:

      File "<string>", position 4
        foo
    [
            ^
    re.PatternError: unterminated character set
    

    Replaced:

      File "<string>", position 4
        foo�[
            ^
    re.PatternError: unterminated character set
    
  3. For online resources, Url "https://link.to/file" is probably better than File "<string>".

  File "file.py", line 1               |   File "file.py", line 1, column 2
    [,]                                |     [,]
     ^                                 |      ^
SyntaxError: invalid syntax            | SyntaxError: invalid syntax
---------------------------------------+---------------------------------------
                                       |   File "file.json", line 1, column 2
                                       |     [,]
                                       |      ^
json.decoder.JSONDecodeError: Expectin | json.JSONDecodeError: Expecting value
g value: line 1 column 2 (char 1)      |
---------------------------------------+---------------------------------------
                                       |   File "<string>", line 1, column 4-5
                                       |     café
                                       |        ^
UnicodeEncodeError: 'ascii' codec can' | UnicodeEncodeError: 'ascii' codec can'
t encode character '\xe9' in position  | t encode character '\xe9: ordinal not 
3: ordinal not in range(128)           | in range(128)
---------------------------------------+---------------------------------------
                                       |   File "file.txt", line 1, column 4-5
                                       |     caf�
                                       |        ^
UnicodeDecodeError: 'ascii' codec can' | UnicodeDecodeError: 'ascii' codec can'
t decode byte 0xe9 in position 3: ordi | t decode byte 0xe9: ordinal not in ran
nal not in range(128)                  | ge(128)
                                       |   File "<string>", line 1, column 1-1
                                       |     +123
                                       |     ^
re.PatternError: nothing to repeat at  | re.PatternError: nothing to repeat
position 0                             |
---------------------------------------+---------------------------------------
                                       |   File "file.xml", line 1, column 8-11
                                       |     <foo></bar>
                                       |            ^^^
xml.parsers.expat.ExpatError: mismatch | xml.parsers.expat.ExpatError: mismatch
ed tag: line 1, column 7               | ed tag
---------------------------------------+---------------------------------------
                                       |   File "file.csv", line 1, column 6
                                       |     "foo" "bar"
                                       |          ^
_csv.Error: ',' expected after '"'     | csv.Error: ',' expected after '"'

Binary parsing errors could still be displayed like this:

  File "file.txt", position 3
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9: ordinal not in range(128)