I have started using the regex module (the latest 2020 release) in Python 3.8.1, but for some reason I can't get a simple expression that selects only consonants to work.
I have tried with these pieces of code:
import regex
s = "test string"
f = regex.findall("[a-z]--[aeiou]", s)
print(f)
s = "test string"
f = regex.findall("[[a-z]--[aeiou]]", s)
print(f)
s = "test string"
f = regex.findall("[a-z--aeiou]", s)
print(f)
s = "test string"
f = regex.findall("[a-z]&&[^aeiou]", s)
print(f)
In each case, I get an empty list.
Can someone kindly tell me where my error is? Is it something about escaping characters? I don't have a clue…
There is a note in the standard library documentation:
Support of nested sets and set operations as in Unicode Technical Standard #18 might be added in the future. This would change the syntax, so to facilitate this change a FutureWarning will be raised in ambiguous cases for the time being. That includes sets starting with a literal '[' or containing literal character sequences '--', '&&', '~~', and '||'. To avoid a warning escape them with a backslash.
So the feature that you are after isn’t part of the normal regular expression repertoire and isn’t implemented. I’m afraid that if you want a set of consonants, you are going to have to write it explicitly: [bcdfghj-np-tv-z]
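As a rough illustration of the two points above, here is a small sketch using only the standard library re module. It assumes Python 3.7 or later, where the FutureWarning from the note is emitted; the variable names are just for illustration.

```python
import re
import warnings

text = "test string"

# Explicit consonant class, as suggested above.
consonants = re.findall(r"[bcdfghj-np-tv-z]", text)
print(consonants)  # ['t', 's', 't', 's', 't', 'r', 'n', 'g']

# In the standard re module the nested-set spelling is not a set
# operation; compiling it only triggers the FutureWarning from the note.
re.purge()  # clear the pattern cache so the pattern is parsed again
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    re.compile(r"[[a-z]--[aeiou]]")

warning_categories = [w.category for w in caught]
print(FutureWarning in warning_categories)
```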
Hi James!
Thank you so much for your super-swift answer!
Actually, I found that it is possible.
You need to enable version 1 behaviour in the regex module, which uses version 0 by default. Then all the expressions I posted above are processed correctly.
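A minimal sketch of what that might look like, assuming the third-party regex package is installed (pip install regex). Only the nested-set form is shown here. The helper name find_consonants is made up for this example, and the fallback to the explicit class is there only so the snippet still runs without the package.

```python
import re

try:
    import regex  # third-party package, not part of the standard library
except ImportError:
    regex = None


def find_consonants(text):
    """Return the lowercase ASCII consonants in text, in order."""
    if regex is None:
        # Fallback: the explicit class works with the standard re module.
        return re.findall(r"[bcdfghj-np-tv-z]", text)
    # regex.V1 (also spelled regex.VERSION1) enables nested sets and
    # set operations such as the difference operator "--".
    return regex.findall(r"[[a-z]--[aeiou]]", text, flags=regex.V1)


print(find_consonants("test string"))
```

The same behaviour can also be switched on inside the pattern itself with the inline flag, for example "(?V1)[[a-z]--[aeiou]]".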
See the following for reference: