- I am trying to find all documents which start with a letter or an underscore, but the result also gives me the ones with numbers in the middle. Why?
import re
doc = 'adsbhif, efjne_rjnfrinjfr2 fvwn33e4 nj-fnjwf'
regex = r'[a-zA-Z_]\w*'
re.findall(regex, doc)
What is the answer you expected from the example data?
What do you mean when you say "document"?
You may need to add ^ to anchor the search to the start of the string. But then findall would not be the right function to use.
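A minimal sketch of what anchoring does, using the sample string from the question. It assumes the goal is to check whether the string as a whole starts with a letter or underscore, which the question does not confirm; the variable names are only for illustration.

```python
import re

doc = "adsbhif, efjne_rjnfrinjfr2 fvwn33e4 nj-fnjwf"

# With ^ the pattern can only match at the very start of the string,
# so findall returns at most one item here.
anchored = re.findall(r"^[a-zA-Z_]\w*", doc)
print(anchored)  # ['adsbhif']

# re.match already anchors at the start of the string and returns
# a match object, or None when the string does not start that way.
m = re.match(r"[a-zA-Z_]\w*", doc)
first_word = m.group() if m else None
print(first_word)  # adsbhif
```

Since there is only one possible match position, re.match (or re.search with ^) says the same thing more directly than findall.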
You may need to include digits as well as \w word chars: [\w\d]*.
\w matches letters, digits (\d) and '_'.
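To see why the tokens with digits show up, here is a small sketch that runs the pattern on the sample string and then tries one possible alternative. The alternative assumes the goal is to keep only words made entirely of letters and underscores; the expected output was never stated, so treat that pattern as a guess.

```python
import re

doc = "adsbhif, efjne_rjnfrinjfr2 fvwn33e4 nj-fnjwf"

# \w covers letters, digits and underscore, so digits after the
# first character are accepted by \w*.
original = re.findall(r"[a-zA-Z_]\w*", doc)
print(original)
# ['adsbhif', 'efjne_rjnfrinjfr2', 'fvwn33e4', 'nj', 'fnjwf']

# Assumed goal: whole words with no digits anywhere.
# \b on both sides stops the match from ending in the middle of a word.
letters_only = re.findall(r"\b[a-zA-Z_]+\b", doc)
print(letters_only)
# ['adsbhif', 'nj', 'fnjwf']
```

Note that "-" is not a word character, so "nj-fnjwf" is split into two matches in both cases.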
Step by step, what do you think your regex means, and how did you decide to put each part of it when you were writing it?