Request for review: support for ** wildcard in pathlib.PurePath.match()

If anyone likes a spot of regex, I have an open PR that adds support for the ** wildcard in pathlib.PurePath.match(). This much-requested feature would make match() support the same wildcards as Path.glob() and rglob(). It substantially improves the performance of match(), and can be used to improve the performance of rglob() in a follow-up PR. It employs a cute trick – swapping newlines and path separators, and compiling a re.Pattern object without setting DOTALL, so that "." matches any character except a path separator. It’s a great chef and an attentive listener with a contagious laugh – basically the perfect PR! Would anyone be willing to review? Thanks.
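For anyone curious how the newline trick could work, here is a minimal sketch. It is not the code from the PR: the names `_SWAP`, `translate` and `matches` are made up for illustration, it assumes "/" as the only separator, and it ignores character classes, case sensitivity and relative-pattern handling. The idea is that once "/" and "\n" are swapped in the path, a plain "." (no DOTALL) can never cross a segment boundary, so only ** needs a construct that does.

```python
import re

# Hypothetical helper table: swap "/" and "\n" so that "." (compiled
# without re.DOTALL) means "any character except a path separator".
_SWAP = str.maketrans({"/": "\n", "\n": "/"})


def translate(pattern):
    """Compile a glob-style pattern into a regex (illustrative only)."""
    segments = pattern.split("/")
    out = []
    for i, seg in enumerate(segments):
        last = i == len(segments) - 1
        if seg == "**":
            # Trailing "**": anything at all, separators included.
            # Otherwise: zero or more whole segments, each followed
            # by a (swapped) separator.
            out.append(r"[\s\S]*" if last else r"(?:.*\n)*")
            continue
        for ch in seg:
            if ch == "*":
                out.append(".*")  # stays inside one segment
            elif ch == "?":
                out.append(".")   # one non-separator character
            else:
                out.append(re.escape(ch.translate(_SWAP)))
        if not last:
            out.append(r"\n")     # the swapped path separator
    return re.compile("".join(out))


def matches(path, pattern):
    """Return True if the whole path matches the pattern."""
    return translate(pattern).fullmatch(path.translate(_SWAP)) is not None
```

With this sketch, `matches("src/a/b/mod.py", "src/**/*.py")` is true, while `matches("src/a/mod.py", "src/*.py")` is false because `*` cannot cross the swapped separator.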


Performance improvements are still okay for 3.12 - do you think it’s worth a careful backport to bring back the regex changes but not allow **?

Or perhaps it’s okay to allow the ** in 3.12 (i.e. backport as is) since it’s currently an error and so won’t impact working code? @thomas thoughts?


Thanks Steve! Backporting just the performance improvement is easy enough. It irks me to put performance improvements in during beta, though, and I’d rather make this a 3.13-only thing if that’s an option?


Shouldn’t do. Provided we aren’t changing the ABI or API surface, it’s fine.

Beta is the best time to find out whether we’re inadvertently breaking behaviour with an optimisation. We just can’t go turning code that currently works into code that unconditionally fails (edge cases are up for discussion).

Edited to add that it’s worth noting that each beta gets progressively worse for this, and we need to ramp up the confidence that we aren’t changing behaviour at the same time.
