In my own JSON library, jsonyx, I use a subclass of SyntaxError to report errors, because it gives the user more context (the file name and the offending line):
json.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
versus
  File "<stdin>", line 1, column 2
    [,]
     ^
jsonyx.JSONSyntaxError: Expecting value
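For readers who haven't built one, here is a minimal sketch of how a SyntaxError subclass can carry that context. It is not the actual jsonyx code: the class body and the constructor signature `(msg, filename, doc, start)` are assumptions made for illustration. It only relies on the documented behaviour that SyntaxError accepts a `(filename, lineno, offset, text)` tuple as its second argument.

```python
class JSONSyntaxError(SyntaxError):
    """Hypothetical stand-in for jsonyx.JSONSyntaxError."""

    def __init__(self, msg: str, filename: str, doc: str, start: int) -> None:
        # Locate the line containing the character index `start`.
        line_start = doc.rfind("\n", 0, start) + 1
        line_end = doc.find("\n", start)
        if line_end < 0:
            line_end = len(doc)
        # Line and column numbers are 1-based, as in Python tracebacks.
        lineno = doc.count("\n", 0, start) + 1
        colno = start - line_start + 1
        text = doc[line_start:line_end]
        # The traceback machinery prints these details as context.
        super().__init__(msg, (filename, lineno, colno, text))


err = JSONSyntaxError("Expecting value", "<stdin>", "[,]", 1)
print(err.filename, err.lineno, err.offset, err.text)
```

Raising such an exception without catching it lets the interpreter print the file name, the source line and the caret on its own.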
This proposal isn’t about changing this behaviour in the json library [1]; it’s about providing the tools to implement this in a third-party library. Let me explain.
Unlike Python files, lines in a JSON file can get very long, making the error hard to read:
  File '<stdin>', line 1
    {"glossary": {"title": "example glossary", "GlossDiv": {"title": "S", "G
lossList": {"GlossEntry": {"ID": "SGML", "SortAs": "SGML", "GlossTerm": "Sta
ndard Generalized Markup Language", "Acronym": "SGML", "Abbrev": "ISO 8879:1
986", "GlossDef": {"para": "A meta-markup language, used to create markup la
nguages such as DocBook.", "GlossSeeAlso": ["GML", "XML"]}, "GlossSee"}}}}}
                                                                      ^
jsonyx.JSONSyntaxError: Expecting ':' delimiter
So, I truncated the line and adjusted the offset (which is far from trivial):
def _get_err_context(doc: str, start: int, end: int) -> tuple[int, str, int]:
    line_start: int = max(
        doc.rfind("\n", 0, start), doc.rfind("\r", 0, start),
    ) + 1
    if match := _match_whitespace(doc, line_start):
        line_start = min(match.end(), start)
    if match := _match_line_end(doc, start):
        line_end: int = match.end()
    else:
        line_end = start
    end = min(line_end, end)
    if match := _match_whitespace(doc[::-1], len(doc) - line_end):
        line_end = max(end, len(doc) - match.end())
    if end == start:
        end += 1
    max_chars: int = get_terminal_size().columns - 4  # leading spaces
    if end == line_end + 1:  # newline
        max_chars -= 1
    text_start: int = max(min(
        line_end - max_chars, end - 1 - max_chars // 2,
        start - (max_chars + 2) // 3,
    ), line_start)
    text_end: int = min(max(
        line_start + max_chars, start + (max_chars + 1) // 2,
        end + max_chars // 3,
    ), line_end)
    text: str = doc[text_start:text_end].expandtabs(1)
    if text_start > line_start:
        text = "..." + text[3:]
    if len(text) > max_chars:
        end -= len(text) - max_chars
        text = (
            text[:max_chars // 2 - 1] + "..." + text[2 - (max_chars + 1) // 2:]
        )
    if text_end < line_end:
        text = text[:-3] + "..."
    return start - text_start + 1, text, end - text_start + 1
  File '<stdin>', line 1
    ...s such as DocBook.", "GlossSeeAlso": ["GML", "XML"]}, "GlossSee"}}}}}
                                                                       ^
jsonyx.JSONSyntaxError: Expecting ':' delimiter
But now there’s no way to determine what the column number is [2], so I’m requesting a way to truncate the line AND display a column number:
  File '<stdin>', line 1, column 371
    ...s such as DocBook.", "GlossSeeAlso": ["GML", "XML"]}, "GlossSee"}}}}}
                                                                       ^
jsonyx.JSONSyntaxError: Expecting ':' delimiter
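To make the requested layout concrete, here is a small sketch of a formatter that produces it. The function `format_truncated_error` and all of its parameters are hypothetical. Nothing like it exists in the standard library, and it takes the already truncated text plus the original column number as separate inputs.

```python
def format_truncated_error(
    filename: str, lineno: int, colno: int,
    text: str, offset: int, end_offset: int, msg: str,
) -> str:
    """Render an error report (hypothetical helper, for illustration)."""
    # colno is the 1-based column in the ORIGINAL line; offset and
    # end_offset are 1-based positions inside the truncated text.
    carets = "^" * max(end_offset - offset, 1)
    return (
        f"  File {filename!r}, line {lineno}, column {colno}\n"
        f"    {text}\n"
        f"    {' ' * (offset - 1)}{carets}\n"
        f"{msg}"
    )


report = format_truncated_error(
    "<stdin>", 1, 371,
    '...s such as DocBook.", "GlossSeeAlso": ["GML", "XML"]}, "GlossSee"}}}}}',
    68, 69, "jsonyx.JSONSyntaxError: Expecting ':' delimiter",
)
print(report)
```

The point of the sketch is that the caret position and the reported column are two independent numbers once the line has been truncated, which is exactly what SyntaxError cannot express today.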
Possible solutions, ranging from least to most work for people using this feature:
- Automatically truncate SyntaxError and display the offset
- Add a TruncatedSyntaxError which will be truncated automatically and also displays the offset
- Allow specifying the column number in the constructor of SyntaxError
My implementation already:
- strips whitespace
- reserves 1 character at the end of the line
- expands tabs
- truncates the start, middle and end, potentially all three
But some questions still remain:
- min, max and default values for the available number of columns
- what to do with unprintable characters
- how should this be configured
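None of these questions is settled. Purely to make the first two concrete, here is one possible shape, with every name and number an assumption: the width is clamped between made-up bounds, and each unprintable character is swapped one for one with U+FFFD so that the offsets stay valid.

```python
from __future__ import annotations

from shutil import get_terminal_size

MIN_COLUMNS = 20   # hypothetical lower bound
MAX_COLUMNS = 200  # hypothetical upper bound


def available_columns(requested: int | None = None) -> int:
    """Clamp the usable width; fall back to the terminal width."""
    columns = get_terminal_size().columns if requested is None else requested
    return max(MIN_COLUMNS, min(columns, MAX_COLUMNS))


def escape_unprintable(text: str) -> str:
    """Replace each unprintable character with U+FFFD, one for one."""
    # The length is unchanged, so caret offsets computed earlier still apply.
    return "".join(ch if ch.isprintable() else "\ufffd" for ch in text)


print(available_columns(500), repr(escape_unprintable("a\x00b")))
```

A one-for-one replacement is only one option. Escaping as `\x00` would be more informative but changes the length, so the offsets would have to be adjusted as well.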
Examples from my unit tests:
# ("columns", "doc", "start", "end", "offset", "text", "end_offset")
# Remove leading space
(8, " current", 0, 8, 1, " current", 9),
#    ^^^^^^^^             ^^^^^^^^
(8, "\tcurrent", 0, 8, 1, " current", 9),
#    ^^^^^^^^^             ^^^^^^^^
(8, " current", 1, 8, 1, "current", 8),
#     ^^^^^^^             ^^^^^^^
(8, "\tcurrent", 1, 8, 1, "current", 8),
#      ^^^^^^^             ^^^^^^^
# Remove trailing space
(8, "current ", 0, 8, 1, "current ", 9),
#    ^^^^^^^^             ^^^^^^^^
(8, "current\t", 0, 8, 1, "current ", 9),
#    ^^^^^^^^^             ^^^^^^^^
(8, "current ", 0, 7, 1, "current", 8),
#    ^^^^^^^              ^^^^^^^
(8, "current\t", 0, 7, 1, "current", 8),
#    ^^^^^^^               ^^^^^^^
# No newline
(9, "start-end", 0, 5, 1, "start-end", 6),
#    ^^^^^                 ^^^^^
# Newline
(8, "current", 7, 7, 8, "current", 9),
#           ^                   ^
(8, "current", 7, 8, 8, "current", 9),
#           ^                   ^
# At least one character
(9, "start-end", 5, 5, 6, "start-end", 7),
#         ^                     ^
# Expand tabs
(9, "start\tend", 5, 6, 6, "start end", 7),
#         ^^                     ^
# Truncate start
(6, "start-middle-end", 13, 16, 4, "...end", 7),  # line_end
#                 ^^^                  ^^^
(7, "start-middle-end", 16, 17, 7, "...end", 8),  # newline
#                    ^                    ^
# Truncate middle
(12, "start-middle-end", 0, 16, 1, "start...-end", 13),
#     ^^^^^^^^^^^^^^^^              ^^^^^^^^^^^^
(13, "start-middle-end", 0, 16, 1, "start...e-end", 14),
#     ^^^^^^^^^^^^^^^^              ^^^^^^^^^^^^^
# Truncate end
(8, "start-middle-end", 0, 5, 1, "start...", 6),  # line_start
#    ^^^^^                        ^^^^^
# Truncate start and end
(7, "start-middle-end", 5, 6, 4, "...-...", 5),
#         ^                          ^
(8, "start-middle-end", 5, 6, 5, "...t-...", 6),
#         ^                           ^
(11, "start-middle-end", 7, 11, 5, "...middl...", 9),
#            ^^^^                       ^^^^
(12, "start-middle-end", 7, 11, 5, "...middle...", 9),
#            ^^^^                       ^^^^
(13, "start-middle-end", 7, 11, 6, "...-middle...", 10),
#            ^^^^                        ^^^^
If you want to play around with this, you can install my library and then use the jsonyx format command:
$ pip install --force-reinstall git+https://github.com/nineteendo/jsonyx
$ echo '[,]' | jsonyx format
  File "<stdin>", line 1, column 2
    [,]
     ^
jsonyx.JSONSyntaxError: Expecting value