Question about (?!... regexp

Alet · February 4, 2023, 11:54am

I can’t understand how it works. For example:

Given this file:

sample
sample="some dont_match"

and this code:

#!/usr/bin/python3

import re

# Read the sample file line by line; the with block closes it afterwards.
with open('file', 'r') as file1:
    for line in file1:
        print(re.match(r'^(sample="(.*?)(?!match"))', line))  #***

This gives a match on the second line. But if I change the line marked (***) to

    print(re.match('^(sample="some dont_(?!match"))', line))  #***

it works as planned.
Is there any way to capture a group but not match if some text is included?

MRAB · February 4, 2023, 6:23pm

.*? is a lazy match. It’ll initially ‘consume’ 0 characters, so you’ll be comparing (?!match") against some dont_match", which will succeed.

I think what you want is ^(sample="(?!.*?match").*?"). This looks ahead for match" and then fails if it was successful.
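A small sketch of both behaviours, using the two patterns from this thread on the problem line. The variable names are only for illustration.

```python
import re

line = 'sample="some dont_match"'

# Lazy .*? first tries zero characters, so the lookahead is tested right
# after the opening quote, where the remaining text is: some dont_match"
lazy = re.match(r'^(sample="(.*?)(?!match"))', line)
print(repr(lazy.group(1)))  # 'sample="'
print(repr(lazy.group(2)))  # ''

# Lookahead placed first: it scans the rest of the line for match"
# and rejects the line if it finds it.
ahead = re.match(r'^(sample="(?!.*?match").*?")', line)
print(ahead)  # None

# A line without match" is accepted and captured whole.
ok = re.match(r'^(sample="(?!.*?match").*?")', 'sample="some do"')
print(ok.group(1))  # sample="some do"
```

The first pattern does match, but only the prefix up to the opening quote, which is why it looks as if the lookahead is ignored.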

Alet · February 5, 2023, 9:33am

No, I need a group before the pattern that should prevent the match. The group will be used later.

Marco_Sulla · February 5, 2023, 3:14pm

Named groups must be written this way:

(?P<name>...)

Regular Expression HOWTO — Python 3.12.1 documentation

(?!...)

Negative lookahead assertion. This is the opposite of the positive assertion; it succeeds if the contained expression doesn’t match at the current position in the string.
Regular Expression HOWTO — Python 3.12.1 documentation

so I suppose it’s not what you need.
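As a minimal illustration of the quoted definition (the strings foo, bar and baz are made up for this sketch, not taken from the question):

```python
import re

# foo(?!bar) means: "foo", but only where "bar" does not come next.
miss = re.search(r'foo(?!bar)', 'foobar')
hit = re.search(r'foo(?!bar)', 'foobaz')

print(miss)         # None
print(hit.group())  # foo
print(hit.span())   # (0, 3)
```

Note that the lookahead consumes nothing: the match is just foo, and whatever follows it is only inspected.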

Maybe you’re looking for:

>>> re.match(
...     r'^(?P<sample>some dont_(?:match))', 
...     "some dont_match"
... )
<re.Match object; span=(0, 15), match='some dont_match'>

Alet · February 6, 2023, 8:19am

I don’t necessarily need the named group.
Actually I need the content of the group if it is not followed by the text. The question is why it doesn’t work with quantifiers?

Alet · February 6, 2023, 11:03am

My mistake, the file for test should be:

garbage
sample="some do"
sample="some dont_match"

The script should not match the first and last lines; it should match the second.

Marco_Sulla · February 6, 2023, 1:22pm

re.match(
    r'^(?P<sample>some do(?!match))', 
    "some do"
)

I hope it’s not homework

Alet · February 6, 2023, 2:03pm

No )) pet project

Alet · February 15, 2023, 11:30am

Sorry for my persistence, but is there any way to use a wildcard with a negative lookahead?

cameron · February 15, 2023, 8:34pm

Can you provide an updated example of what you’re trying to match, an
example which should not match, and what you’re trying as a regexp?

Cheers,
Cameron Simpson cs@cskk.id.au

Alet · February 16, 2023, 7:26am

The main goal is to add the substring if it is not present. But the exact content of the line is not known. The sample file:

some_unneeded_format
sample="some"
sample="some dont_match"

The first line should not match because it isn’t in the right format. The second line should match, with the text “some” stored somewhere. The third line already contains the substring “dont_match”, so it should not match.

Last regexp:

print(re.search('^(sample=".*?(?!match"))', line))

But it matches the third line.

Thanks,
Alexander

Marco_Sulla · February 16, 2023, 8:09pm

Instead of .*?, try [^"]*+, available in Python 3.11 or with the third-party regex module from PyPI:

cameron · February 16, 2023, 9:33pm

The main goal is to add the substring if it is not present. But the
exact content of the line is not known. The sample file:
some_unneeded_format
sample="some"
sample="some dont_match"
The first line should not match because it isn’t in the right format. The second line should match, with the text “some” stored somewhere. The third line already contains the substring “dont_match”, so it should not match.

Thank you for these examples.

Last regexp:
print(re.search('^(sample=".*?(?!match"))', line))
But it matches the third line.

This is because you’re not thinking correctly about the way patterns
match. Patterns backtrack: they try for the longest sequence which
will match your pattern, but if they don’t match at the longest extent
they will try a shorter extent which is still valid.

To take your third misbehaving example, your pattern doesn’t match:

 sample="some dont_match"

but it does match:

 sample="some dont_match

by letting the .*? match this text:

 some dont

which lets it “not match” the match" against the text _match.

(Actually, you’ve got a nongreedy pad .*?, so in fact it will match an
empty string right after your double quote, because the text some dont_match" is not a match for match".)
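One way to check what the engine actually settled on is to print the matched text instead of the match object. This is only a sketch; the greedy variant with .* is not from the original post and is added here for comparison.

```python
import re

line = 'sample="some dont_match"'

# Nongreedy pad: .*? takes nothing, the lookahead is tested right after
# the double quote and succeeds, so the match stops there.
lazy = re.search(r'^(sample=".*?(?!match"))', line)
print(repr(lazy.group(1)))    # 'sample="'

# Greedy pad (assumed variant): .* runs to the end of the line, where
# nothing follows, so the lookahead succeeds there as well.
greedy = re.search(r'^(sample=".*(?!match"))', line)
print(repr(greedy.group(1)))  # 'sample="some dont_match"'
```

In both cases the pattern finds some position where match" does not come next, so the lookahead never rejects the line.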

What I think you want to say is: do not match if the string match"
occurs anywhere later in the text. So the pattern you want to “not
match” looks like:

 .*match"

Note the leading .* to allow that “not match” pattern to span to any
point in the text.

So I moved your .*? to inside the (?!....) section:

 import re
 for line in r'''
 some_unneeded_format
 sample="some"
 sample="some dont_match"'''.split("\n"):
     print(line)
     print(" ", re.search('^(sample=".*?(?!match"))', line))
     print(" ", re.search('^(sample="(?!.*?match"))', line))

Personally I don’t use the nongreedy forms very often, and tend to write
.* instead of .*?; I find the greedy forms easier to think about.

BTW, I recommend always using “raw strings” to express your regexps, eg:

 r'^(sample="(?!.*?match"))'

As soon as you’ve got a slosh-escape in your regexp they pay off.
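A short sketch of where raw strings start to matter, using \b (word boundary) as the example escape. The pattern and test text are made up for illustration.

```python
import re

# In a normal string literal, "\b" is a backspace character (0x08).
plain = '\bsample\b'
# In a raw string, the backslash reaches the regexp engine untouched,
# so \b means "word boundary".
raw = r'\bsample\b'

print(len(plain), len(raw))  # 8 10

text = 'a sample line'
plain_match = re.search(plain, text)
raw_match = re.search(raw, text)
print(plain_match)           # None
print(raw_match.group())     # sample
```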

Cheers,
Cameron Simpson cs@cskk.id.au

Alet · February 18, 2023, 6:21pm

Thanks for the solution, and sorry for the late answer.

I’m trying to implement this in the project; more questions may arise.

Alet · February 20, 2023, 10:31am

There are some issues with this solution. I need the string before the pattern. If I use:

print(" ", re.search('^sample="(?!.*?match")', line))

and try to reference \1, an ‘invalid group reference’ error is raised. If I use:

print(" ", re.search('^(sample="(?!.*?match"))', line))

the group doesn’t contain the text before the negative match (“some dont_” in this case).
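A lookahead consumes nothing, so a group wrapped around it can only hold what came before it, here sample=". One possible way around this is to keep the lookahead as a guard and add a separate capturing group after it for the value. This is a sketch under that assumption, not necessarily what the linked answer does, and the appended text " dont_match" is only an example.

```python
import re

lines = [
    'some_unneeded_format',
    'sample="some"',
    'sample="some dont_match"',
]

# (?!.*?match") rejects lines that already contain match" somewhere ahead.
# (.*?) then captures the existing value up to the closing quote.
pattern = re.compile(r'^sample="(?!.*?match")(.*?)"')

# \1 is the captured value; lines that do not match are returned unchanged.
results = [pattern.sub(r'sample="\1 dont_match"', line) for line in lines]

for result in results:
    print(result)
# some_unneeded_format
# sample="some dont_match"
# sample="some dont_match"
```

Only the second line is rewritten. The first is in the wrong format and the third is stopped by the lookahead.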

Alet · March 12, 2023, 9:52am

Solution:

https://stackoverflow.com/questions/53032368/adding-a-parameter-to-a-line-with-ansible-with-lineinfile-function-and-regexp
