I have the following strings that I would like to match and replace. I am able to match the strings, but I am struggling to carry out the replacement: the output either doesn’t change or entire sections of the input string are erased. I have included an example of one of the many options I have tried.
Note that the pattern I posted is not identical to yours:
# This is all one group, hence why \2 doesn't work in the replacement.
your_pattern = r"^\([A-Za-z0-9]+\)\([A-Za-z0-9]+\)\([^)]*\)[a-zA-Z]+$"
# Group 1: the three parenthesised blocks. Group 2: the trailing letters.
my_pattern = r"^(\([A-Za-z0-9]+\)\([A-Za-z0-9]+\)\([^)]*\))([a-zA-Z]+)$"
A replacement string of \1* using your original pattern results in the following output: (BXXADDB)(BXXXCAC1)(CXX2A)CANEVER*
which is not what you want.
If you copy what I wrote exactly you will get no error.
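To make the difference concrete, here is a minimal sketch that runs my_pattern against the string from the output above and prints what each group captures. The `\1*\2` replacement is only my assumption about what you are after (an asterisk between the parenthesised part and the trailing letters); swap in whatever replacement you actually need.

```python
import re

my_pattern = r"^(\([A-Za-z0-9]+\)\([A-Za-z0-9]+\)\([^)]*\))([a-zA-Z]+)$"
text = "(BXXADDB)(BXXXCAC1)(CXX2A)CANEVER"

# Two capturing groups: the parenthesised blocks, then the trailing letters.
match = re.match(my_pattern, text)
groups = match.groups()
print(groups)

# Hypothetical replacement: put an asterisk between the two groups.
result = re.sub(my_pattern, r"\1*\2", text)
print(result)
```

Because both parts of the string are captured separately, the replacement can reference `\1` and `\2` independently.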
For scenario two, consider the following pattern:
pattern = r"^([a-zA-Z0-9]+\(),([^)]*\))$"
# Same pattern with explanations
pattern = re.compile(
    r"""^           # Match beginning of line
    (               # Group 1 start
      [a-zA-Z0-9]+  # Match an alphanumerical string of arbitrary length
      \(            # Match an opening parenthesis
    )               # Group 1 end
    ,               # Match a single comma
    (               # Group 2 start
      [^)]*         # Match a string of arbitrary length until reaching a closing parenthesis
      \)            # Match a closing parenthesis
    )               # Group 2 end
    $               # Match end of line""",
    re.X,
)
Can you see what the replacement string should be to get the desired result?
Not quite. If you remove the comma from the pattern it won’t match the comma, or anything after it, as you noticed.
Instead, note that the two groups contain everything except the first comma inside the parentheses. For the replacement string, you just need to put the two groups together: re.sub(pattern, r"\1\2", string)
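Here is a short sketch of that replacement in action. The input strings are made up for illustration, since I am guessing at what your scenario-two data looks like (a name, an opening parenthesis, a stray leading comma, then the rest).

```python
import re

pattern = r"^([a-zA-Z0-9]+\(),([^)]*\))$"

# Hypothetical input with an unwanted comma right after the opening parenthesis.
string = "ABC1(,DEF,GHI)"

# Group 1 is "ABC1(" and group 2 is "DEF,GHI)"; the comma between them is dropped.
fixed = re.sub(pattern, r"\1\2", string)
print(fixed)

# A string without the leading comma does not match, so it is returned unchanged.
untouched = re.sub(pattern, r"\1\2", "ABC1(DEF,GHI)")
print(untouched)
```

Only the first comma is removed, because it is the only part of the match that sits outside both groups.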