Is it a colorizer bug?

wrong_str = "
123456
123456
"

This is a SyntaxError, but the web page colors 123456 as part of the string.
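A quick way to confirm that the snippet really is rejected is to hand it to the built-in compile(). This is only a minimal sketch; the helper name is_valid_python is made up for illustration.

```python
# The snippet from the post, written as an ordinary (valid) Python string.
source = 'wrong_str = "\n123456\n123456\n"\n'


def is_valid_python(src: str) -> bool:
    """Return True if src compiles, False if it raises SyntaxError.

    Hypothetical helper, for illustration only.
    """
    try:
        compile(src, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(is_valid_python(source))               # the multi-line version
print(is_valid_python('ok = "123456"\n'))    # the same literal on one line
```

The first call reports the source as invalid, because a string opened with a single quote character cannot continue past the end of the line.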

Many websites seem to have the same problem, and so does the REPL.

IDLE does not have this problem, but it appeared when a Python contributor tried to color code in IDLE with the tokenizer instead of re.
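For reference, the standard-library tokenize module can be asked directly whether it accepts the snippet. This is a rough sketch, not how IDLE or the PR does it; the helper name has_tokenize_error is hypothetical. Older Python versions report the stray quote as an ERRORTOKEN while newer ones raise tokenize.TokenError, so the sketch checks for both.

```python
import io
import tokenize


def has_tokenize_error(source: str) -> bool:
    """Return True if the tokenizer rejects or flags part of the source.

    Hypothetical helper, for illustration only.
    """
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            # Older versions yield ERRORTOKEN for the unmatched quote.
            if tok.type == tokenize.ERRORTOKEN:
                return True
    except (tokenize.TokenError, SyntaxError):
        # Newer versions raise instead of yielding ERRORTOKEN.
        return True
    return False


bad = 'wrong_str = "\n123456\n123456\n"\n'
good = 'right_str = "123456"\n'
print(has_tokenize_error(bad), has_tokenize_error(good))
```

Either way the tokenizer does not treat the three lines as one string, so a colorizer built on it has to decide for itself how to paint the rejected region.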

This is the effect in PR #140347:

In REPL:

String literals can span multiple lines. One way is using triple-quotes: """...""" or '''...''' . End-of-line characters are automatically included in the string, but it’s possible to prevent this by adding a \ at the end of the line.
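As a small illustration of the quoted paragraph (the variable names are only examples), a triple-quoted literal keeps its line breaks unless a line ends with a backslash:

```python
# Newlines inside a triple-quoted string are part of the value.
triple = """first
second"""

# A trailing backslash suppresses the newline that would otherwise be included.
no_edges = """\
first
second\
"""

print(repr(triple))
print(repr(no_edges))
```

Both literals produce the same two-line value; only triple quotes make the raw line break legal.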

That does not seem to answer the question in the title, which is about the colorizer. It currently also colors a multi-line string enclosed in a single pair of quotes (not triple quotes) as a string, which is wrong.

Does it matter how invalid code is colored? Garbage in, garbage out.

Of course. Coloring it as normal code will mislead learners.

This is not a case of garbage in, garbage out.

If any invalid code could be colored arbitrarily, an IDE would be useless: is code valid before you have finished typing it?

PyCharm colors it correctly.

Oh I see, I thought you were using colorizer like any other PyPI package. In VS Code (after I turn on the Python extensions) I see this:

So wrong_str is highlighted for you, differently to how the correct string literals are highlighted for me.

Here is a test page for highlight.js that reproduces this:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Highlight.js Python test for quote</title>

<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/styles/default.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/languages/python.min.js"></script>

<script>
hljs.highlightAll();
</script>
</head>

<body>
<h2>Highlight.js Python</h2>

<pre><code class="language-python">
# 1. valid string
a = 'abc'
print(a)

# 2. invalid string
b = 'abc
def'
print(b)
</code></pre>

</body>
</html>

Here is the relevant part of the generated HTML:

<span class="hljs-string">'abc def'</span>

It seems that every web page that uses highlight.js to color code has this problem. I have now reported the issue.

That was hasty.

Post a screenshot of how correct string literals are highlighted, so that it can be compared with wrong_str.

That already appears earlier in this thread. Both PyCharm and VS Code highlight it correctly. GitHub also colors wrong_str correctly.

File a bug report with highlight.js if you care. This is not something that this website does itself; it uses a very standard external piece of software.

So this thread should be moved to Discourse Feedback, right? At first I was uncertain, but now with the latest posts, it seems obvious that the issue is about the discuss.python.org website itself and its “Discourse” engine, right?

No

  • This isn’t a “bug” in Discourse. It’s in highlight.js
  • Raising it on this website at all isn’t helpful - it’s not going to get anything changed
  • You have to convince the maintainers over there that it’s a bug at all - I disagree, but I don’t care to argue with you, and it doesn’t matter if you convince me or not.

Yes, of course I agree that ultimately it very much seems to be an issue with highlight.js. But I still think it’d be more helpful to move this thread to Discourse Feedback. It would have a better chance of catching the eye of the people in charge of discuss.python.org, and they might be able to escalate it to the Discourse team and eventually to the highlight.js team.

Zero chance of that happening. You can just escalate to highlight.js yourself. These kinds of small highlighting “bugs” for invalid code are not going to be considered serious by most people.

You want three different teams to take this bug seriously enough to bother other teams. Essentially, you want to waste the time of at least ~a dozen people instead of raising a bug report yourself.

I do not care that much about the bug itself; I care about moving posts to the category where they fit best. I also care more about what the original poster, @Locked-chess-official, thinks about my suggestion. You, @MegaIng, made a good point and I acknowledged it.

Here is the issue I reported there.

Thanks for the link. Can you host the html snippet somewhere, to just view it in a browser? It makes it really clear, without having to save it locally. I’d add a third example of the triple quoted multi-line string too, personally.

Here is the pen.

Nice one. In case you missed it, I’ve suggested a fix. I think illegal: '\\n', or illegal: /\n/ just needs to be added here:

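The effect of treating a newline as illegal inside a quoted string can be sketched in Python with re. This is not the highlight.js grammar, only an illustration of the idea; the names STRING_RE and find_strings are made up, and triple-quoted strings are deliberately left out of the sketch.

```python
import re

# A string opened with ' or " that must close on the same line.
# A backslash may escape any character, including a newline (line
# continuation), but a raw newline inside the quotes is rejected.
STRING_RE = re.compile(r"""(['"])(?:\\.|(?!\1)[^\\\n])*\1""", re.DOTALL)


def find_strings(source: str) -> list[str]:
    """Return every single-line string literal found in source."""
    return [m.group(0) for m in STRING_RE.finditer(source)]


valid = "a = 'abc'\nprint(a)\n"
invalid = "b = 'abc\ndef'\nprint(b)\n"
continued = "c = 'abc\\\ndef'\n"   # backslash-newline inside the quotes

print(find_strings(valid))
print(find_strings(invalid))
print(find_strings(continued))
```

With the raw newline excluded, the invalid example yields no string match at all, while the valid one and the backslash-continued one are still recognized.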