devdave
(DJW)
November 11, 2022, 5:18pm
1
I asked this in Help already, and I apologize if this is not the correct place to ask.
I have a semi-working tokenizer and, thanks to LibCST, a nearly rock-solid parser implemented.
In Python/tokenizer.c, f-strings are wrapped up as dumb ‘f"Hello {place}"’ strings that are parsed somewhere else.
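To make the "one dumb string" behaviour concrete, here is a minimal sketch that uses the standard-library `tokenize` module as a stand-in for Python/tokenizer.c (that substitution is an assumption: on older versions `tokenize` is a separate pure-Python implementation). The helper name `token_names` is made up for this example. As far as I know, interpreters before 3.12 report the whole f-string as a single STRING token, while 3.12 and later split it into FSTRING_START, FSTRING_MIDDLE and FSTRING_END pieces.

```python
import io
import tokenize


def token_names(source: str) -> list[str]:
    """Return the token type names that the stdlib tokenizer emits for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# The result depends on the interpreter version, so it is printed rather than assumed.
names = token_names('f"Hello {place}"\n')
print(names)
```

If the first name printed is STRING, the tokenizer has left the contents of the f-string for a later stage to pick apart.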
Somewhere after that middleman parser, f-strings go through the PEG parser with this rule (cpython/python.gram at main · python/cpython · GitHub), but that rule isn’t prepared to handle ‘f"Hello {place}"’.
The Python in Rust project steps character by character over an f-string to create a mini-AST, which makes me think CPython does the same, but I haven’t been able to find where.
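For anyone unsure what "stepping character by character" looks like, here is a rough sketch of the idea. It is not CPython's code and not the Rust project's code; the names `Literal`, `Expression` and `split_fstring_body` are invented for illustration. It takes only the text between the quotes and deliberately ignores conversions (`!r`), format specs (`:>10`), the `=` specifier, and quotes inside the expression.

```python
from dataclasses import dataclass


@dataclass
class Literal:
    text: str  # plain text copied through unchanged


@dataclass
class Expression:
    source: str  # raw source of a replacement field, not yet parsed


def split_fstring_body(body: str) -> list:
    """Split the text between the quotes of an f-string into a flat node list."""
    parts, literal = [], []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch in "{}" and body[i + 1:i + 2] == ch:
            literal.append(ch)  # "{{" and "}}" are escaped braces
            i += 2
        elif ch == "{":
            if literal:
                parts.append(Literal("".join(literal)))
                literal = []
            depth, j = 1, i + 1
            while j < n and depth:  # scan to the matching close brace
                if body[j] in "([{":
                    depth += 1
                elif body[j] in ")]}":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unterminated replacement field")
            parts.append(Expression(body[i + 1:j - 1]))
            i = j
        elif ch == "}":
            raise ValueError("single '}' is not allowed")
        else:
            literal.append(ch)
            i += 1
    if literal:
        parts.append(Literal("".join(literal)))
    return parts


parts = split_fstring_body("Hello {place}")
print(parts)  # [Literal(text='Hello '), Expression(source='place')]
```

Each `Expression.source` would then be handed to the ordinary expression parser, which is where the "mini-AST" comes from.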
For the curious, my code base is at github under user devdave, project name rython4. The antispam filter won’t let me post more than two hrefs to a post so that’s why there isn’t a direct link.
pablogsal
(Pablo Galindo Salgado)
November 12, 2022, 9:30pm
2
Currently, this happens here:
#include <stdbool.h>

#include <Python.h>

#include "tokenizer.h"
#include "pegen.h"
#include "string_parser.h"

//// STRING HANDLING FUNCTIONS ////

static int
warn_invalid_escape_sequence(Parser *p, const char *first_invalid_escape, Token *t)
{
    unsigned char c = *first_invalid_escape;
    int octal = ('4' <= c && c <= '7');
    PyObject *msg =
        octal
        ? PyUnicode_FromFormat("invalid octal escape sequence '\\%.3s'",
                               first_invalid_escape)
        : PyUnicode_FromFormat("invalid escape sequence '\\%c'", c);
This file has been truncated.
But this will change soon, as we are working to integrate this with the PEG parser.
devdave
(DJW)
November 13, 2022, 6:38pm
3
Is the plan to parse something like f"Hello {place=}" directly with the PEG parser, or to do something more like the Python in Rust project, where I think they break it down to an AST?
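For reference, this is how the result of that parse can be inspected from Python itself, whichever component produces it. A small sketch using only the standard `ast` module; the value assigned to `place` is just sample data. The exact `ast.dump` text can differ between versions, so it is printed rather than quoted here.

```python
import ast

# Parse the f-string as a standalone expression.
tree = ast.parse('f"Hello {place=}"', mode="eval")
joined = tree.body  # an ast.JoinedStr node
print(ast.dump(joined))

# Each replacement field shows up as a FormattedValue child.
fields = [node for node in joined.values if isinstance(node, ast.FormattedValue)]
print(fields[0].value.id, fields[0].conversion)

# Evaluate the parsed tree with a sample value for `place`.
place = "world"
rendered = eval(compile(tree, "<fstring>", "eval"))
print(rendered)  # Hello place='world'
```

The `=` specifier leaves no node of its own: the text `place=` ends up as literal output and the field falls back to `repr()`, which is why the conversion is 114, i.e. `ord("r")`.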
pablogsal
(Pablo Galindo Salgado)
November 13, 2022, 7:06pm
4
The idea is to parse it with the PEG parser itself, yeah.
devdave
(DJW)
November 14, 2022, 6:44pm
5
@pablogsal Any chance there is an experimental fork/branch with this new ruleset for parsing f-strings?
pablogsal
(Pablo Galindo Salgado)
November 15, 2022, 2:43pm
6
There is:
But don’t draw any conclusions from it yet; we are still heavily working on it.
devdave
(DJW)
November 15, 2022, 7:04pm
7
This is what Instagram’s LibCST developers tried for parsing f-strings: LibCST/grammar.rs at main · Instagram/LibCST · GitHub