A language belongs to LL(1) if it can be parsed by an LL(1) parser, period. It makes no sense for us to discuss language theory.
The quest for other kinds of grammars (theoretical language classes) and parsing strategies started precisely because you asserted, and nobody has disagreed, that LL(1) grammars are restrictive in expressive power.
That’s a non-issue if you’re willing to consider PEG, which could cover a larger class of languages than CFG, and which is probably linear in memory and time for a language like Python.
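To make the linear-time remark concrete, here is a minimal packrat-style sketch. The toy grammar and the names `make_parser`, `start`, `expr`, and `term` are hypothetical and have nothing to do with Python's real grammar. The point is only that memoizing each (rule, position) pair means a failed alternative does not force the parser to redo work.

```python
from functools import lru_cache

def make_parser(text):
    """Packrat-style recognizer for a toy PEG (hypothetical, not Python's grammar):

        start <- expr '=' expr / expr
        expr  <- term ('+' term)*
        term  <- NUMBER / '(' expr ')'

    Each rule returns the position after the match, or None on failure.
    """
    calls = {"expr": 0}  # counts real (non-cached) evaluations of expr

    @lru_cache(maxsize=None)  # memoize per position: the packrat idea
    def term(pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        if end > pos:  # first alternative: NUMBER
            return end
        if pos < len(text) and text[pos] == "(":  # second alternative
            after = expr(pos + 1)
            if after is not None and after < len(text) and text[after] == ")":
                return after + 1
        return None

    @lru_cache(maxsize=None)
    def expr(pos):
        calls["expr"] += 1
        end = term(pos)
        if end is None:
            return None
        while end < len(text) and text[end] == "+":
            nxt = term(end + 1)
            if nxt is None:
                break
            end = nxt
        return end

    def start(pos=0):
        # Ordered choice: try "expr '=' expr" first, then fall back to "expr".
        left = expr(pos)
        if left is not None and left < len(text) and text[left] == "=":
            right = expr(left + 1)
            if right is not None:
                return right
        return expr(pos)  # answered from the cache, not re-parsed

    return start, calls
```

With the input `1+2`, the first alternative fails at the missing `=`, and the fallback reuses the cached result for `expr` at position 0 instead of parsing again. The price is the memo table, which is where the "linear in memory" part comes from.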
My experience is that LL(small-k) is so restrictive that the only way to know if a grammar complies is to pass it through the LL(small-k) parser generator. LL and LR are outstanding theoretical and practical achievements from the epoch of shared computer time and 256K RAM for a whole computing department. Today is now.
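As a rough illustration of the check a generator has to run, the sketch below computes FIRST sets and reports FIRST/FIRST conflicts. It assumes a grammar with no empty productions, and the grammar, `first_sets`, and `ll1_conflicts` are made up for this post rather than taken from any real tool.

```python
def first_sets(grammar):
    """FIRST sets by fixed-point iteration.

    Assumes no empty productions, so only the first symbol of each
    alternative matters. Symbols that are not keys of `grammar` are tokens.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                head = alt[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def ll1_conflicts(grammar):
    """Return (rule, token, earlier_alt, later_alt) for each FIRST/FIRST clash."""
    first = first_sets(grammar)
    conflicts = []
    for nt, alternatives in grammar.items():
        seen = {}  # token -> the alternative that already claimed it
        for alt in alternatives:
            head = alt[0]
            starts = first[head] if head in grammar else {head}
            for tok in sorted(starts):
                if tok in seen:
                    conflicts.append((nt, tok, seen[tok], alt))
                else:
                    seen[tok] = alt
    return conflicts

# Hypothetical toy grammar: an assignment and a bare expression
# can both begin with a NAME token.
GRAMMAR = {
    "stmt": [("target", "=", "expr"), ("expr",)],
    "target": [("NAME",)],
    "expr": [("NAME",), ("NUMBER",)],
}
```

Calling `ll1_conflicts(GRAMMAR)` flags `stmt`: with one token of lookahead, a `NAME` does not tell the parser which alternative to pick. That is the kind of surprise you typically only find by running the grammar through the tool.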
Although I sketched a plan on this discussion thread that I can work on by myself, I made sure to at least try to reconcile it with what you requested, which I think I’ve summarized several times as something like:
- Python parsing is old, and there’s an opportunity to modernize it before 3.9 or 4.0.
- Among the reasons to modernize, the expressive power of LL(1) is poor for current Python.
- How about going directly to AST considering that CST is only used during parsing?
- Do we actually need the tokenizer?
There you go!