Intro/Problem
Recently I used pattern matching for the first time to parse JSON documents, and I enjoyed it very much; it felt natural for the problem at hand.
I did however find myself trying to do something with a sequence pattern that just isn’t possible according to PEP 634. What I was trying to do was find the first occurrence of a subpattern within a sequence and bind it.
Example
The JSON documents I was trying to parse take one of the following forms:
# sources is a single string that is a url
{"sources": "some string"}
# sources is an object that has an attribute called url containing a string
{"sources": {"url": "some string"}}
# sources is an array of a single string
{"sources": [ "some string"]}
# sources is an array of multiple strings
{"sources": [ "some strings1", "some strings2", ..., "some stringsN"]}
# sources is an array containing objects, at least one of which has
# an attribute called "url" containing a string that is a url
{"sources": [{"url": "some string"}]}
Current Solution
This is copied and pasted from what I am currently working with:
# NOTE need to determine how to check if the string is a valid url within the pattern
match sources:
    case str() as url:
        ...
    case {'url': str() as url}:
        ...
    case [str() as url]:
        ...
    case list():
        # can't pattern match into the list more :(
        # assuming first item found with "url" in it is what we want and go with that
        search = next((x for x in sources if "url" in x), None)
        match search:
            case {'url': str() as url}:
                ...
            case _:
                # anything else means it's a string, list of strings, or None
                # and I doubt the homepage has "url" in it directly
                raise KeyError
    case _:
        raise KeyError
For the first three cases pattern matching works wonderfully, but it was less than ideal when I needed to match inside a sequence. When I first got into this, I naively thought I would be able to check whether any item in a sequence matched a subpattern and bind it. After reading PEP 634 more closely, I realized that sequence patterns depend heavily on absolute position, so I had to do what I did above.
Proposal
Add a next pattern and a filter pattern that would, respectively:
- Match the first item in the sequence that matches the subpattern(s) and bind it; otherwise go to the next case
- Check all items in the sequence against the subpattern(s); if any of them match, the case matches, otherwise go to the next case
I believe this would be a very natural and useful extension to pattern matching on sequences.