takes off noob packaging contributor hat
puts on Pygments committer hat
Pygments (the syntax highlighting library) uses regular expressions pervasively. There have been discussions about switching from the stdlib re module to the third-party regex module, which is a C extension module.
Pip depends on Pygments through rich. Pip also has an obvious bootstrapping problem, which it solves by vendoring all of its dependencies.
If Pygments started to depend on an extension module, could pip still use it? Alternatively, I’m guessing pip could patch rich to not require Pygments; a cursory look at the code suggests this would not be difficult, but I’m not totally sure.
Have there been discussions about moving regex into the stdlib? Isn’t it at this point strictly better than the stdlib version?
Perhaps the Pygments dependency in rich can be made optional (or relegated to an extra)? I don’t think code highlighting should be required for pip to function (or even fancy terminal formatting, for that matter).
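A minimal sketch of what an optional Pygments dependency could look like inside a library, assuming a guarded import with a plain-text fallback. The function name `render_code` is hypothetical and is not rich's actual API; only the Pygments calls (`highlight`, `PythonLexer`, `TerminalFormatter`) are real.

```python
# Sketch only: guarded import so the library still works without Pygments.
try:
    from pygments import highlight
    from pygments.formatters import TerminalFormatter
    from pygments.lexers import PythonLexer
except ImportError:
    # Pygments is not installed; highlighting is simply disabled.
    highlight = None


def render_code(source: str) -> str:
    """Return highlighted Python source if Pygments is available, else the text unchanged.

    `render_code` is a hypothetical helper used for illustration.
    """
    if highlight is None:
        return source
    return highlight(source, PythonLexer(), TerminalFormatter())
```

With this shape, a consumer such as pip could skip vendoring Pygments and still get readable (if uncolored) output.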
To answer the direct question here: pip wouldn’t be able to vendor the extension, so it would no longer be able to vendor Pygments. Ideally rich would make Pygments an optional dependency, but if not, we’d have to look at either patching or pinning an old version of Pygments, neither of which is ideal.
The immediate practical answer is that we wouldn’t upgrade our vendored copy of Pygments until we found the resources to address the issue.
Could pygments add the new dependency as desired but keep fallback code for the missing import?
Not really. The reasons for switching to regex include its richer regular expression syntax (e.g., Unicode category character classes and more control over backtracking). If we start using these features in the thousands of regexes that Pygments contains, I don’t see us maintaining parallel lexers that use
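To illustrate why a plain import fallback does not help, here is a small sketch assuming a pattern that uses a Unicode category class. The pattern `UPPER_WORD` and the helper `compiles_with` are hypothetical, not taken from Pygments. The stdlib re module rejects `\p{Lu}` as a bad escape, so a lexer written against regex syntax cannot simply be compiled with re instead.

```python
import re

try:
    import regex  # third-party C extension; may not be installed
except ImportError:
    regex = None

# Hypothetical lexer pattern: a word starting with an uppercase letter
# in any script, using a Unicode category class.
UPPER_WORD = r"\p{Lu}\w*"


def compiles_with(module, pattern: str) -> bool:
    """Return True if `module.compile` accepts the pattern.

    Both re and regex expose their own `error` exception class.
    """
    try:
        module.compile(pattern)
    except module.error:
        return False
    return True


# The stdlib module does not support \p{...} and raises on it.
stdlib_ok = compiles_with(re, UPPER_WORD)

# Only checked when the third-party module happens to be installed.
regex_ok = compiles_with(regex, UPPER_WORD) if regex is not None else None
```

An ASCII-only rewrite such as `[A-Z]\w*` compiles under both modules, but it is a different (narrower) pattern, which is the maintenance burden described above.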
I have no idea how hard it is, but we should push (or contribute to) rich to make pygments an optional dependency.
This has been suggested before (Textualize/rich issue #2277, "[REQUEST] Add a [minimal] version of rich without commonmark and pygments"), but it would break backwards compatibility, since those dependencies would no longer be included by default. It seems like it would be useful for extras to be able to exclude dependencies, but currently that isn’t possible AFAIK.
I think rich can do that, if we can also figure out how default extras should work.