Python regex library is not returning the result properly

Hi,

This is my first time using the regex library in Python. I also use the regex library in C++17, and there the same function gives me the expected result. Here is an example:

Python implementation:
---------------------------

import re
def CheckUsername(name):
    LOGIN_NAME_MAX = 128
    if not name or len(name) > LOGIN_NAME_MAX:
        return False

    pattern = "^[a-zA-Z_][a-zA-Z0-9_-]*[$]?"
    return bool(re.match(pattern, name))

C++ implementation:
---------------------------

#include <regex>
static bool
CheckUsername(const string& name)
{
   if (name.empty() || name.length() > LOGIN_NAME_MAX) {
      return false;
   }
   // NOTE: This regex was suggested in the useradd man pages
   regex re("^[a-zA-Z_][a-zA-Z0-9_-]*[$]?");
   return regex_match(name, re);
}

Test Input

  1. “temp”
  2. “test@”

Expected result in both Python and C++

print(CheckUsername("temp")) # Output: True
print(CheckUsername("test@")) # Output: False

As shown above, “test@” returns True in Python, but the expected result is False. I think there is an issue somewhere.

Thanks,
Amber

I think you want to use re.fullmatch or terminate the match with $:

^[a-zA-Z_][a-zA-Z0-9_-]*[$]?$
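A minimal sketch of the second suggestion, keeping re.match() and only appending $ to the pattern from the original post. The helper name check_username_anchored is made up for illustration. Note that [$]? is still in the pattern, so a trailing literal dollar sign is still accepted.

```python
import re

# Pattern from the original post with a trailing "$" anchor appended.
PATTERN = r"^[a-zA-Z_][a-zA-Z0-9_-]*[$]?$"

def check_username_anchored(name):
    # Hypothetical helper: re.match() anchors at the start,
    # and the final "$" in the pattern anchors at the end.
    return bool(re.match(PATTERN, name))

print(check_username_anchored("temp"))   # True
print(check_username_anchored("test@"))  # False
print(check_username_anchored("temp$"))  # True, because of the optional literal "$"
```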

This is the standard username validation regex; I took it from a man page. The C++ function gives me the expected result, but the Python one does not. I listed the expected results as comments next to the calls; if both test inputs produce those outputs, we are good to go.

re.match() tests whether the regular expression matches the beginning of the string. Use re.fullmatch() to test whether it matches the whole string.

^ at the beginning of the regular expression used with re.match() or re.fullmatch() is redundant.
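To illustrate both points, here is a small sketch using the pattern from the original post. The variable names are only for illustration.

```python
import re

PATTERN = r"^[a-zA-Z_][a-zA-Z0-9_-]*[$]?"          # pattern from the original post
PATTERN_NO_CARET = r"[a-zA-Z_][a-zA-Z0-9_-]*[$]?"  # same pattern without the leading ^

# re.match() only needs a match at the start, so the prefix "test" is enough.
prefix = re.match(PATTERN, "test@")
print(prefix.group())                    # test

# re.fullmatch() needs the whole string to match, so "test@" is rejected.
print(re.fullmatch(PATTERN, "test@"))    # None

# Both functions already anchor at the start, so dropping ^ changes nothing.
for name in ("temp", "test@", "@temp"):
    assert bool(re.match(PATTERN, name)) == bool(re.match(PATTERN_NO_CARET, name))
    assert bool(re.fullmatch(PATTERN, name)) == bool(re.fullmatch(PATTERN_NO_CARET, name))
```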

I wonder whether you intentionally accept names like “test$” as a username?


Hello Serhiy,
Thanks for the idea of using fullmatch. I was directly converting C++ regex_match() to Python re.match(). After replacing match with fullmatch, I am getting the expected result.

Thank you so much, team.


match() vs fullmatch() isn’t your issue.

The issue is that [$]? matches zero or one literal dollar sign characters; it is not an end-of-input anchor. Your current code using fullmatch() would accept ‘temp$’, which is incorrect unless you want to allow the optional dollar sign.

Replace [$]? with simply $ and match() works properly.
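A rough comparison of the two readings of the dollar sign, as a sketch. The names OPTIONAL_DOLLAR, END_ANCHOR and compare are made up for illustration. Which behaviour is right depends on whether a trailing literal $ should be allowed in a username.

```python
import re

OPTIONAL_DOLLAR = r"[a-zA-Z_][a-zA-Z0-9_-]*[$]?"  # original: optional literal "$"
END_ANCHOR = r"[a-zA-Z_][a-zA-Z0-9_-]*$"          # suggested: "$" as an end anchor

def compare(name):
    # Returns (original pattern with fullmatch, suggested pattern with match).
    with_literal = bool(re.fullmatch(OPTIONAL_DOLLAR, name))
    with_anchor = bool(re.match(END_ANCHOR, name))
    return with_literal, with_anchor

for name in ("temp", "temp$", "test@"):
    print(name, *compare(name))
# temp True True
# temp$ True False
# test@ False False
```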

Actually, it was. In C++, regex_match works like Python’s fullmatch, and regex_search works like Python’s search. C++ has no dedicated function equivalent to Python’s match, although regex_search can be made to act like it by starting the regexp with a caret (“^”).
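For the Python side of that comparison, a short sketch of how the three functions differ on the same pattern. The helper name where_it_matches is hypothetical, and the C++ mapping in the comments is taken from the reply above rather than verified here.

```python
import re

PATTERN = r"[a-zA-Z_][a-zA-Z0-9_-]*"

def where_it_matches(text):
    return {
        "search": bool(re.search(PATTERN, text)),        # anywhere (like C++ regex_search)
        "match": bool(re.match(PATTERN, text)),          # at the start only
        "fullmatch": bool(re.fullmatch(PATTERN, text)),  # whole string (like C++ regex_match)
    }

print(where_it_matches("temp"))   # all three True
print(where_it_matches("test@"))  # search and match True, fullmatch False
print(where_it_matches("@temp"))  # only search True

# A leading ^ makes search() behave like match() on single-line input.
print(re.search(r"^" + PATTERN, "@temp"))  # None
```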

Test with the change I suggested. It works properly.

Unless the OP actually wants an optional dollar sign character. In that case, simply append a bare $ to the pattern.

Do you want to accept ‘temp\n’?

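Python’s regex doesn’t treat multiline input line by line unless you tell it to with a flag such as re.MULTILINE (re.DOTALL only changes what . matches). Even without flags, though, $ also matches just before a trailing newline, so a pattern ending in $ accepts ‘temp\n’.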
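On the trailing-newline question, a small sketch of how $, \Z and fullmatch() treat ‘temp\n’, and of what re.MULTILINE changes. The variable names are only for illustration, and the optional dollar sign is left out to keep the pattern short.

```python
import re

NAME = r"[a-zA-Z_][a-zA-Z0-9_-]*"

# "$" also matches just before a newline at the very end of the string.
dollar = bool(re.match(NAME + r"$", "temp\n"))
# "\Z" matches only at the very end of the string.
end_only = bool(re.match(NAME + r"\Z", "temp\n"))
# fullmatch() must consume the whole string, including the newline.
full = bool(re.fullmatch(NAME, "temp\n"))
print(dollar, end_only, full)  # True False False

# Without re.MULTILINE, ^ and $ do not apply to each line separately.
single = re.search(r"^" + NAME + r"$", "bad@\ntemp")
multi = re.search(r"^" + NAME + r"$", "bad@\ntemp", flags=re.MULTILINE)
print(single)         # None
print(multi.group())  # temp
```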