Using multi-line strings for literate programming, aka Jupyter Notebook as a Python file

Several people are looking for a way to encode a “Jupyter Notebook” as a text file, for example as Markdown or Python.

If we decide to embed the text cells in the Python file, we need to encode each text cell as a comment. But this is a pain to edit with a standard text editor, because we have to prefix each line with #. This is due to the lack of a multi-line comment syntax in Python, like /*...*/ in C.

However, I just realized that it is syntactically correct to insert multi-line strings in Python source code. When the multi-line string is placed outside a docstring location, it just behaves like a ghost string: it is evaluated and then discarded.
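To illustrate the behaviour described above, here is a minimal sketch. The sample source and the names `source` and `namespace` are made up for this illustration; it only relies on the built-in `compile`/`exec` and on `ast.get_docstring`.

```python
import ast

# Hypothetical module source: a bare multi-line string between two statements.
source = '''\
x = 1
"""
This text sits between two statements.
It is evaluated as a string and then thrown away.
"""
y = x + 1
'''

# Run the source in a fresh namespace: the string has no visible effect.
namespace = {}
exec(compile(source, "<ghost-string-demo>", "exec"), namespace)
print(namespace["x"], namespace["y"])

# The string is not the first statement, so it is not a module docstring.
print(ast.get_docstring(ast.parse(source)))
```

The string would only be recorded as a docstring if it were the first statement of the module, class, or function.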

I would like to have the opinion of the core Python developers about this topic.

Since CPython already supports this syntax, it would just be a matter of polishing…

However, this usage will require special handling when we convert a Python file to a notebook, because the ghost string output must be discarded.

Some relevant projects:
(note: I cannot insert more than 2 links…)

the following links…

This is something that tools doing AST or text processing of source code should easily be able to handle. Jupyter notebooks are not really a topic for the CPython core team, nor is this really the correct place for discussions about this.

What exactly would need to be polished in terms of support from the Python side of things?
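As a rough sketch of what such AST-based processing could look like (not any existing tool's implementation): the helper below splits a source file into "markdown" and "code" cells. The names `split_cells`, `_is_bare_string`, and `_first_line` are hypothetical, and the sketch assumes Python 3.8+ for `end_lineno`. It treats every top-level bare string as a text cell, including one in module docstring position, and drops comments that sit between a string and the neighbouring code.

```python
import ast
import inspect


def _first_line(node):
    """Line where a statement starts, including any decorators."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def _is_bare_string(node):
    """True for an expression statement that is only a string literal."""
    return (
        isinstance(node, ast.Expr)
        and isinstance(node.value, ast.Constant)
        and isinstance(node.value.value, str)
    )


def split_cells(source):
    """Return a list of (kind, text) pairs, kind being 'markdown' or 'code'."""
    lines = source.splitlines()
    cells = []
    pending = []  # consecutive top-level statements that are not bare strings

    def flush():
        # Emit the pending statements as one code cell, using source lines
        # so that blank lines and comments inside the run are preserved.
        if pending:
            start = _first_line(pending[0]) - 1
            end = pending[-1].end_lineno
            cells.append(("code", "\n".join(lines[start:end])))
            pending.clear()

    for node in ast.parse(source).body:
        if _is_bare_string(node):
            flush()
            cells.append(("markdown", inspect.cleandoc(node.value.value)))
        else:
            pending.append(node)
    flush()
    return cells


example = '''\
"""
# Title

Some *markdown* text.
"""
x = 1
y = x + 1

"""A second text cell."""
print(y)
'''

for kind, text in split_cells(example):
    print(kind, repr(text))
```

Because the strings are removed from the code cells, a converter built this way never has to display or discard their values.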


It is only correct to do so where a statement is expected. The following is invalid:

match 3:
    """Matching 3"""
    case 3:
        print("x")

This might be the only counterexample, as I think match is the first compound statement whose body is not simply a list of statements. (case ... itself is part of the syntax, not a standalone statement.)
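The counterexample can be checked without running anything, by asking `compile` whether the source parses. This is a small sketch; `compiles` and the three sample sources are hypothetical names introduced here, and the `match` variants assume Python 3.10 or later.

```python
def compiles(source):
    """Return True if `source` parses as a Python module."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A bare string directly under `match`: only `case` blocks are allowed there.
string_under_match = '''\
match 3:
    """Matching 3"""
    case 3:
        print("x")
'''

# The same string moved inside the `case` block, where statements are expected.
string_inside_case = '''\
match 3:
    case 3:
        """Matching 3"""
        print("x")
'''

# A bare string inside an ordinary block, for comparison.
string_inside_if = '''\
if True:
    """Not a docstring, just an expression statement"""
    print("x")
'''

print(compiles(string_under_match))
print(compiles(string_inside_case))
print(compiles(string_inside_if))
```

The first source is rejected, while the `if` variant is accepted; the `case` variant is accepted on interpreters that support `match`.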

What you call a “ghost string” is just an expression statement being used in its capacity to provide a doc string. (You can use strings anywhere a statement is expected, because it’s simpler than trying to enumerate where doc strings are used and restrict the grammar to only accept strings in those locations.)


I think that what you’re looking for is not to have things embedded in Python, but to embed Python in something else.

This is the approach taken in the R community with rmarkdown, and the results are quite impressive for a text format which handles like a Jupyter notebook. I believe that they also have extensions to support Python in rmarkdown, but I don’t know much about it.
I would encourage you to look a bit at rmarkdown for some useful prior art.


As pointed out above, multiline strings are just expressions. If you don’t assign them, nothing happens. But they aren’t syntactically special. Any Python literal or expression can be used in this way.
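A quick way to see this is to look at how the parser classifies such lines. In the sketch below (sample source and variable names are made up; `ast.Constant` for literals assumes Python 3.8+), every line becomes an `ast.Expr` statement, whether it is a string, a number, a list, or a call.

```python
import ast

# Hypothetical source made only of unassigned expressions.
source = '''\
42
"""a multi-line
string"""
[1, 2, 3]
len("abc")
'''

tree = ast.parse(source)

# Each top-level statement is an expression statement wrapping some value node.
for stmt in tree.body:
    print(type(stmt).__name__, type(stmt.value).__name__)

kinds = [type(stmt.value).__name__ for stmt in tree.body]
```

The string gets no special node type of its own: it is a constant expression like `42`, wrapped in the same kind of statement as the list and the call.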