This is just to jot down some design ideas before they slip out of my memory and into oblivion. Please ignore at will.
Base criteria
Bootstrapping Python will depend on nothing but C, and on the Python that can be bootstrapped from C.
PGL
There needs to be a Python Grammar Language (PGL from now on) to bootstrap Python parsing. The main goal for PGL is that:
- It is understandable by humans.
The only hard requirement for PGL is that:
- It is powerful enough to describe LL(k) grammars.
Since small variations of EBNF are good enough to write PEGs, and PEGs cover more than LL and LR grammars (and maybe also all CFGs), the requirement above is a done deal. That leaves human understandability of a notation that leads to deterministic parsing as the only criterion.
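To make the determinism point concrete, here is a minimal sketch of PEG-style ordered choice in Python. The helper names (lit, seq, choice) and the toy if_stmt rule are made up for illustration and are not part of any proposed PGL. The point is only that trying alternatives in order and committing to the first match never leaves two valid parses.

```python
# Hypothetical mini-combinators; each parser maps (src, pos) to a new
# position on success or None on failure.

def lit(text):
    """Match an exact piece of text."""
    def parse(src, pos):
        return pos + len(text) if src.startswith(text, pos) else None
    return parse

def seq(*parsers):
    """Match every parser one after another."""
    def parse(src, pos):
        for parser in parsers:
            pos = parser(src, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """Ordered choice: the first alternative that matches wins."""
    def parse(src, pos):
        for parser in parsers:
            end = parser(src, pos)
            if end is not None:
                return end
        return None
    return parse

# The longer alternative is listed first, so "if x else y" is never
# read as a bare "if x" followed by leftover text.
if_stmt = choice(
    seq(lit("if"), lit(" x"), lit(" else"), lit(" y")),
    seq(lit("if"), lit(" x")),
)

full_end = if_stmt("if x else y", 0)
short_end = if_stmt("if x", 0)
no_match = if_stmt("while x", 0)
```

Because the order of alternatives is part of the grammar, a human reader can predict which branch is taken just by reading the rule top to bottom.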
Step 0
Given PGL, we use "anytool" to produce a top-down, recursive-descent, library-independent parser (call it pglc) for PGL in Python and/or C (preferably Python). That done, we ditch "anytool" and carry on maintaining the very small pglc by hand.
pglc is now a tool that can read the syntax of Python described in PGL and output whatever the next step of parsing needs. It is a grammar-to-grammar translator and/or a grammar-to-parser compiler. The language for defining the Python language is now fully defined.
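As a rough idea of how small a hand-maintained pglc front end could be, here is a sketch of a recursive-descent parser for a single grammar rule. The notation it accepts (name: alternatives, '|' for choice, '*' for repetition, quoted literals, parentheses) is an assumption made for this example, since PGL itself is not specified here. The tuple-shaped output tree is likewise hypothetical.

```python
import re

# One token per match: a rule name, a quoted literal, or an operator.
TOKEN = re.compile(
    r"\s*(?:(?P<name>[A-Za-z_]\w*)|(?P<lit>'[^']*')|(?P<op>[:|()*]))"
)

def tokenize(text):
    text, pos, out = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character at offset {pos}")
        out.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return out

class RuleParser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)

    def take(self, kind, value=None):
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok[1]

    def rule(self):   # rule: NAME ':' alts
        name = self.take("name")
        self.take("op", ":")
        body = self.alts()
        if self.peek()[0] is not None:
            raise SyntaxError(f"unexpected {self.peek()[1]!r}")
        return name, body

    def alts(self):   # alts: seq ('|' seq)*
        options = [self.seq()]
        while self.peek() == ("op", "|"):
            self.pos += 1
            options.append(self.seq())
        return ("choice", options) if len(options) > 1 else options[0]

    def seq(self):    # seq: item+
        items = []
        while self.peek()[0] in ("name", "lit") or self.peek() == ("op", "("):
            items.append(self.item())
        if not items:
            raise SyntaxError("empty sequence")
        return ("seq", items) if len(items) > 1 else items[0]

    def item(self):   # item: atom '*'?
        node = self.atom()
        if self.peek() == ("op", "*"):
            self.pos += 1
            node = ("star", node)
        return node

    def atom(self):   # atom: NAME | LITERAL | '(' alts ')'
        kind, value = self.peek()
        if kind == "name":
            self.pos += 1
            return ("ref", value)
        if kind == "lit":
            self.pos += 1
            return ("lit", value[1:-1])
        self.take("op", "(")
        node = self.alts()
        self.take("op", ")")
        return node

def parse_rule(text):
    return RuleParser(tokenize(text)).rule()

sum_rule = parse_rule("sum: term ('+' term)*")
```

Each method mirrors one line of the notation's own grammar (shown in the comments), which is what keeps a parser like this easy to maintain by hand.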
The zeroth part of bootstrap is done.
Step 1
At the next step, pglc takes a description of the Python syntax in PGL and produces a top-down, recursive-descent parser for Python source as a C-libpython program.
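The grammar-to-parser step can be sketched as plain text generation: one emitted function per rule, with alternatives tried in order. To keep the sketch runnable, it emits Python source rather than the C-libpython program described here, and the grammar is given as a hand-built dict instead of real PGL. Both choices, along with the names compile_grammar and GRAMMAR, are assumptions made for illustration.

```python
def compile_grammar(grammar):
    """Emit Python source with one parse_<rule>(src, pos) per rule.

    Each generated function returns the position after the match,
    or None when no alternative matches.
    """
    lines = []
    for name, alternatives in grammar.items():
        lines.append(f"def parse_{name}(src, pos):")
        for alt in alternatives:
            lines.append("    p = pos")
            for item in alt:
                if item.startswith("'"):
                    # Quoted item: match the literal text.
                    text = item[1:-1]
                    lines.append(
                        f"    p = p + {len(text)} if p is not None "
                        f"and src.startswith({text!r}, p) else None"
                    )
                else:
                    # Bare item: call the function for that rule.
                    lines.append(
                        f"    p = parse_{item}(src, p) "
                        f"if p is not None else None"
                    )
            lines.append("    if p is not None:")
            lines.append("        return p")
        lines.append("    return None")
        lines.append("")
    return "\n".join(lines)

# Toy grammar: a non-empty string of binary digits.
GRAMMAR = {
    "digit": [["'0'"], ["'1'"]],
    "bits": [["digit", "bits"], ["digit"]],
}

generated_source = compile_grammar(GRAMMAR)

# Load the generated parser and try it out.
namespace = {}
exec(generated_source, namespace)
parse_bits = namespace["parse_bits"]

end_all = parse_bits("1011", 0)
end_partial = parse_bits("12", 0)
end_fail = parse_bits("x", 0)
```

Swapping the emitted strings for C function bodies would follow the same structure, since the generator only walks the rules and prints one function for each.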
Steps 2-3
The produced C-libpython parser can take text written in Python and translate it into an AST with enough #line information for further stages to report parsing, semantic, and runtime errors accurately.
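For a sense of what "enough line information" looks like in practice, the sketch below uses CPython's existing ast module as a stand-in for the generated parser. It is not the parser proposed here. It only shows the kind of position data (lineno, col_offset) that AST nodes and syntax errors would need to carry.

```python
import ast

GOOD_SOURCE = "x = 1\ny = x + 2\n"
BAD_SOURCE = "x = 1\ny = = 2\n"

# Every statement and expression node records where it came from.
tree = ast.parse(GOOD_SOURCE)
second_stmt = tree.body[1]      # the assignment on line 2
addition = second_stmt.value    # the "x + 2" expression

stmt_position = (second_stmt.lineno, second_stmt.col_offset)
expr_position = (addition.lineno, addition.col_offset)

# A parse error carries the line it was detected on.
try:
    ast.parse(BAD_SOURCE)
    error_line = None
except SyntaxError as err:
    error_line = err.lineno
```

Later stages can reuse the same positions: a semantic check or a runtime traceback only needs to read the line and column stored on the node it is working with.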
Steps 1-3 of Python parsing are complete.