Is this the correct way to document regex use?

I’m not worried about whether the code works, just about the documentation of the regex. Is this acceptable, or is there a better way to document a regex if one is going to use one?

import math
import re
MODULE_NAME = math

commands_list = []
for name in dir(MODULE_NAME):
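    # Raw string, so that \b reaches the regex engine as a word boundary and not as a backspace character
    if re.search(r"(\b[a-z][^_A-Z0-9]+\B_[a-z]+\B_[a-z]+)", name):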
#   Match the regex below and capture its match into backreference number 1 «(\b[a-z][^_A-Z0-9]+\B_[a-z]+\B_[a-z]+)»
#    Assert position at a word boundary (position preceded or followed—but not both—by a Unicode letter, digit, or underscore) «\b»
#    Match a single character in the range between “a” and “z” (case sensitive) «[a-z]»
#    Match any single character NOT present in the list below «[^_A-Z0-9]+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
#       The literal character “_” «_»
#       A character in the range between “A” and “Z” (case sensitive) «A-Z»
#       A character in the range between “0” and “9” «0-9»
#    Assert position NOT at a word boundary (position both preceded and followed—or both not preceded and not followed—by a Unicode letter, digit, or underscore) «\B»
#    Match the character “_” literally «_»
#    Match a single character in the range between “a” and “z” (case sensitive) «[a-z]+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
#    Assert position NOT at a word boundary (position both preceded and followed—or both not preceded and not followed—by a Unicode letter, digit, or underscore) «\B»
#    Match the character “_” literally «_»
#    Match a single character in the range between “a” and “z” (case sensitive) «[a-z]+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
        print(name)
    # Successful match
    else:
        print('Match failed ', name )

Should the documentation for the regex come after its use, as I did, or before it? I want to learn to do things right in the first place rather than having to re-learn them.
Thanks,

I prefer something like this:

name_search_re = re.compile(r"""
    (                     # Group 1
        \b                # Assert word boundary
        [a-z]             # One lower case character in a-z
        [^_A-Z0-9]+       # One or more characters NOT _, A-Z or 0-9
        \B                # Assert NOT word boundary
        _                 # Literal underscore
        [a-z]+            # One or more characters in a-z
        \B                # Assert NOT word boundary
        _                 # Literal underscore
        [a-z]+            # One or more characters in a-z
    )                     # End of group 1
    """, re.VERBOSE)

if name_search_re.search(name):
    ...

But I don’t think there is a universally correct or incorrect way.
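One thing to watch with re.VERBOSE: unescaped whitespace in the pattern is ignored and an unescaped # starts a comment, so a literal space or # has to be escaped or put in a character class. A minimal sketch of that pitfall (the pattern and the names tag_re and broken_re are made up for illustration; they are not from the code above):

```python
import re

# Hypothetical pattern: a word, one literal space, "#", then digits,
# e.g. "issue #42". The space and the "#" must be protected in VERBOSE mode.
tag_re = re.compile(r"""
    [a-z]+      # a lower-case word
    [ ]         # a literal space (a bare space would be ignored)
    \#          # a literal "#" (a bare "#" would start a comment)
    [0-9]+      # the number
    """, re.VERBOSE)

# Same idea without the protection: the space is dropped and everything
# from the bare "#" onward is a comment, so only [a-z]+ is left.
broken_re = re.compile(r"""
    [a-z]+ #[0-9]+
    """, re.VERBOSE)

print(bool(tag_re.fullmatch("issue #42")))     # True
print(bool(broken_re.fullmatch("issue #42")))  # False
print(bool(broken_re.fullmatch("issue")))      # True
```

So if a verbose pattern stops matching after you reformat it, a swallowed space or # is a likely suspect.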


I also think that the original goes into way too much detail and explanation of what the various bits mean. It’s meant to be a comment, not a tutorial! :)

I understand your point about too much information; that is why I asked. My concern was that many programmers have difficulty with regex, so I gave as much information as possible (holding their hand with excess information), not knowing the level of expertise of a programmer who needs to analyze the regex.
Thanks for your insight,

(The common recommendation “comment the why” is also applicable, I think.

[a-z]+  # id part. Must be lowercase because RFC xyz section Q

is more helpful. If the regex contains more than just common elements, e.g. if \B is considered exotic, a reminder or keyword to look up is great, but the reasoning might still be included:

\B       # Not word boundary. Example: abc123 should not match because ...

)
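Put together, “comment the why” in a verbose regex might look something like the sketch below. The order-id format and every reason given in the comments are invented for illustration; the point is only that each comment says why the piece is there rather than restating the syntax.

```python
import re

# Hypothetical requirement, invented for this sketch: an order id looks like
# "ab-1234" (two lower-case letters, a hyphen, four digits).
order_id_re = re.compile(r"""
    \b          # start on a word boundary so "xab-1234" is not picked up mid-word
    [a-z]{2}    # region code; lower case only because (made-up rule) ids are stored normalised
    -           # separator
    [0-9]{4}    # serial number; fixed width so "ab-12345" is rejected
    \b          # end on a word boundary
    """, re.VERBOSE)

print(bool(order_id_re.search("order ab-1234 shipped")))  # True
print(bool(order_id_re.search("xab-1234")))               # False
print(bool(order_id_re.search("ab-12345")))               # False
```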


I think it’s normal to put comments before the code that they refer to.
For regex I like to include examples of each use case that should match and each that should not match.
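As a rough sketch of what I mean, using the pattern from the first post (written as a raw string). The example names are made up, and the two lists are just one possible way to keep the examples next to the regex so they can be checked as well as read:

```python
import re

# Same pattern as posted earlier in the thread, as a raw string.
name_search_re = re.compile(r"(\b[a-z][^_A-Z0-9]+\B_[a-z]+\B_[a-z]+)")

# Example names below are made up for illustration.
SHOULD_MATCH = [
    "foo_bar_baz",      # three lower-case parts joined by underscores
    "foo_bar_baz_qux",  # search() is satisfied by the first three parts
    "foo-x_bar_baz",    # perhaps surprising: [^_A-Z0-9] also accepts "-"
]
SHOULD_NOT_MATCH = [
    "foo_bar",          # only one underscore
    "Foo_bar_baz",      # must start with a lower-case letter
    "f_bar_baz",        # first part needs at least two characters
    "foo1_bar_baz",     # digits are excluded from the first part
]

for name in SHOULD_MATCH:
    assert name_search_re.search(name), name
for name in SHOULD_NOT_MATCH:
    assert name_search_re.search(name) is None, name
```

Writing the examples down also tends to expose cases you did not intend, such as the hyphen one here.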

Wouldn’t the ‘why’ be covered in the doc section? Then it would be duplicated in both, if I’m thinking correctly. I’m never going to use Python for work; I’m just doing this for personal use. But if you don’t have a set of guidelines or rules to follow, you end up with a mess after a while. I happen to have memory problems, so my comments are a little too verbose for most people. I can see that.
That being said, setting a standard keeps your code maintainable, readable and concise.
I’m learning and starting to get a mental picture of what to write and how to write it effectively. I’m currently reading about pragmatic programming.