Using multi-line string for literate programming, aka Jupyter Notebook as a Python file

fabricesalvaire · January 20, 2024, 3:08pm

Several people are looking for a way to encode a “Jupyter Notebook” as a text file, for example in Markdown or Python.

Suppose we decide to embed the text cells in the Python file. We then need to encode each text cell as a comment, which is a pain to edit in a standard text editor because every line has to be prefixed with #. This is because Python lacks a multi-line comment syntax such as /*...*/ in C.

However, I just realized that it is syntactically valid to insert multi-line strings in Python source. When a multi-line string is placed outside a docstring location, it simply behaves like a ghost string: it is evaluated and then discarded.
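A minimal sketch of the idea, assuming a made-up two-cell script held in a variable named `source` (both the script and the names are illustrative only). The triple-quoted block sits between two statements, so it is not a docstring, and running the script shows that it has no effect on the surrounding code:

```python
import ast

# Hypothetical "notebook as a Python file": a text cell between two code cells.
source = '''
x = 1 + 1

"""
# A text cell

Plain prose, written without a leading "#" on every line.
"""

y = x * 2
'''

namespace = {}
exec(compile(source, "<notebook>", "exec"), namespace)

# The string was evaluated and thrown away; only the assignments left a trace.
print(namespace["x"], namespace["y"])  # prints: 2 4

# The string is not in the docstring position, so the module has no docstring.
print(ast.get_docstring(ast.parse(source)))  # prints: None
```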

I would like to hear the opinion of the core Python developers on this topic.

Since CPython already supports this syntax, it would just be a matter of polishing…

However, this usage will require special handling when we convert a Python file to a notebook, because the ghost string output must be discarded.
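One way such a conversion could look, as a rough sketch rather than how any existing tool works: parse the file with the standard `ast` module, turn every top-level bare string into a text cell, and group the remaining statements into code cells. The helper name `split_cells` and the sample script are invented for illustration. Known limitations of this sketch: comments between statements are dropped, and a module docstring is treated as a text cell too.

```python
import ast
import textwrap


def split_cells(source):
    """Hypothetical helper: return a list of (kind, text) cells,
    where kind is "markdown" or "code"."""
    cells = []
    for node in ast.parse(source).body:
        is_bare_string = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        if is_bare_string:
            # The "ghost string" becomes a text cell instead of cell output.
            text = textwrap.dedent(node.value.value).strip()
            cells.append(("markdown", text))
            continue
        code = ast.get_source_segment(source, node)
        if cells and cells[-1][0] == "code":
            # Merge consecutive statements into one code cell.
            cells[-1] = ("code", cells[-1][1] + "\n" + code)
        else:
            cells.append(("code", code))
    return cells


example = '''import math

"""
# Circle area

The next cell computes the area for a unit radius.
"""

radius = 1.0
area = math.pi * radius ** 2
'''

for kind, text in split_cells(example):
    print(f"--- {kind} ---")
    print(text)
```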

Some relevant projects:
(note: I cannot insert more than 2 links…)

fabricesalvaire · January 20, 2024, 3:09pm

the following links…

executablebooks/jupyter-book: Create beautiful, publication-quality books and documents from computational content.
…
mpastell/Pweave: Pweave is a scientific report generator and a literate programming tool for Python. It can capture the results and plots from data analysis and works well with numpy, scipy and matplotlib.

MegaIng · January 20, 2024, 4:35pm

This is something that tools doing AST or text processing of source code should be easily capable of doing. Jupyter notebooks are not really a topic for the CPython core team, nor is this really the correct place for discussions about this.

What exactly would need to be polished in terms of support from the Python side of things?

chepner · January 21, 2024, 6:39pm

It is only correct to do so where a statement is expected. The following is invalid:

match 3:
    """Matching 3"""
    case 3:
        print("x")

This might be the only counterexample, as I think match is the first compound statement whose body is not simply a list of statements. (case ... itself is part of the syntax, not a standalone statement.)
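This can be checked without running anything by handing both variants to the built-in `compile()`. The sketch below uses a small helper, `is_valid`, which is a name made up for this example, and contrasts the `match` snippet with a string placed in an ordinary `if` body:

```python
# The snippet from above: a string where a "case" block is required.
bad = '''match 3:
    """Matching 3"""
    case 3:
        print("x")
'''

# A string inside an ordinary statement body is accepted.
good = '''if True:
    """Inside an if body"""
    print("x")
'''


def is_valid(source):
    """Hypothetical helper: True if the source compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(is_valid(good))  # prints: True
print(is_valid(bad))   # prints: False
```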

What you call a “ghost string” is just an expression statement being used in its capacity to provide a doc string. (You can use strings anywhere a statement is expected, because it’s simpler than trying to enumerate where doc strings are used and restrict the grammar to only accept strings in those locations.)

sirosen · January 21, 2024, 9:57pm

I think that what you’re looking for is not to have things embedded in Python, but to embed Python in something else.

This is the approach taken in the R community with rmarkdown, and the results are quite impressive for a text format which handles like a Jupyter notebook. I believe that they also have extensions to support Python in rmarkdown, but I don’t know much about it.
I would encourage you to look a bit at rmarkdown for some useful prior art.

As pointed out above, multiline strings are just expressions. If you don’t assign them, nothing happens. But they aren’t syntactically special. Any Python literal or expression can be used in this way.
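To illustrate, here is a small sketch that parses a few unassigned expressions with the standard `ast` module (the sample source is invented for the example). Each one becomes an ordinary expression statement, and the string is not treated differently from the number, the list, or the call:

```python
import ast

# Four expressions used as statements; none of them is assigned to anything.
source = '''
42
"""a multi-line
string"""
[1, 2, 3]
len("abc")
'''

kinds = [
    (type(node).__name__, type(node.value).__name__)
    for node in ast.parse(source).body
]

for statement_kind, expression_kind in kinds:
    print(statement_kind, expression_kind)
# prints:
# Expr Constant
# Expr Constant
# Expr List
# Expr Call
```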
