apalala
(Juancarlo Añez)
April 28, 2019, 1:00pm
56
apalala:
I’ll keep exploring ideas, time permitting. PEG provides for flexible and clear grammars, and easy-to-debug/maintain parsers, and it doesn’t require separate tokenization. Yet PEG parsing requires memoization to perform well. Could skipping the CST compensate for the cost of memoization, since the memos would already be AST nodes? A good experiment would be to produce a separate Python->AST parser with PEG just to take a look at the times and memory; if they’re not way off, it could be an approachable strategy.
I missed PackCC in my previous search. I’m leaving a note here:
PackCC is a parser generator for C. Its main features are as follows:
The grammar of an output parser can be described in a PEG (Parsing Expression Grammar). The PEG is a top-down parsing language, and is similar to the regular expression grammar. Compared with a bottom-up parsing language, like Yacc's one, the PEG is much more intuitive and cannot be ambiguous. The PEG does not require tokenization to be a separate step, and tokenization rules can be written in the same way as any other grammar...
The tool has the correct set of features, and it’s cool that it’s implemented as a single .c file. The license is MIT: PackCC download | SourceForge.net
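To make the "memos would already be AST" idea from the quote concrete, here is a minimal sketch of a hand-written packrat-style parser in Python. It is not PackCC output and not a proposal for the real grammar. The toy arithmetic grammar, the tuple-shaped AST, and the names `parse`, `expr`, `term`, `atom` and `skip_ws` are all assumptions made for illustration. `functools.lru_cache` stands in for the memo table, and each rule returns an AST node directly, so no CST is ever built and no separate tokenizer is needed.

```python
from functools import lru_cache


def parse(text):
    """Parse e.g. '1 + 2 * (3 - 4)' straight into nested tuples (a toy AST)."""

    # Every rule maps a position to (ast_node, next_pos), or None on failure.
    # lru_cache memoizes each rule per position, so the memo entries
    # already hold AST nodes rather than CST nodes.

    def skip_ws(pos):
        # Whitespace is handled inline; there is no separate tokenizer.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        return pos

    @lru_cache(maxsize=None)
    def expr(pos):
        # expr <- term (('+' / '-') term)*
        result = term(pos)
        if result is None:
            return None
        node, pos = result
        while True:
            p = skip_ws(pos)
            if p >= len(text) or text[p] not in "+-":
                break
            rhs = term(p + 1)
            if rhs is None:
                break
            node, pos = (text[p], node, rhs[0]), rhs[1]
        return node, pos

    @lru_cache(maxsize=None)
    def term(pos):
        # term <- atom (('*' / '/') atom)*
        result = atom(pos)
        if result is None:
            return None
        node, pos = result
        while True:
            p = skip_ws(pos)
            if p >= len(text) or text[p] not in "*/":
                break
            rhs = atom(p + 1)
            if rhs is None:
                break
            node, pos = (text[p], node, rhs[0]), rhs[1]
        return node, pos

    @lru_cache(maxsize=None)
    def atom(pos):
        # atom <- NUMBER / '(' expr ')'
        pos = skip_ws(pos)
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        if end > pos:
            return int(text[pos:end]), end
        if pos < len(text) and text[pos] == "(":
            inner = expr(pos + 1)
            if inner is not None:
                close = skip_ws(inner[1])
                if close < len(text) and text[close] == ")":
                    return inner[0], close + 1
        return None

    result = expr(0)
    if result is None or skip_ws(result[1]) != len(text):
        raise SyntaxError("could not parse input")
    return result[0]
```

Calling `parse("1 + 2 * (3 - 4)")` yields nested tuples of the form `(operator, left, right)`. This toy grammar barely backtracks, so the cache rarely hits here; the sketch only shows the shape of the approach. Measuring time and memory on a real Python grammar would still be the actual experiment.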