Perform Case Insensitive Search

This code works and returns the 5 lines before the string ‘JUMP’, but I need to make the search case-insensitive.

Best, Dave

from collections import deque

def search(lines, pattern, history=1):
    previous_lines = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            yield line, previous_lines
        previous_lines.append(line)

# Example use on a file
if __name__ == '__main__':
    with open('test.txt') as f:
        for line, prevlines in search(f, 'JUMP', 5):
            for pline in prevlines:
                print(pline, end='')
            print(line, end='')
            print('X'*20)

test.txt

I saw the cow
he did jump
over the moon

1
2
3
4
5
6
7
8
9
10
jump

nothing here
1
2
3
4
5
JUMP

Perhaps if pattern.lower() in line.lower()


Thanks

I thought I'd add a regular expression, but it raises an error.

search.re(f, (/JUMP/i), 5):

Best,
Dave

That’s because that isn’t valid Python code. See the “re” module for
using regular expressions in Python.
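
If you do want to go the regex route, here is a minimal sketch of what it could look like. The name search_regex is made up for this example, and it assumes the pattern should be matched as literal text (hence re.escape). It also yields a list copy of the previous lines rather than the deque itself, which is a small change from the original.

```python
import re
from collections import deque

def search_regex(lines, pattern, history=1):
    """Yield (line, previous_lines) for each line containing pattern, ignoring case."""
    # re.escape treats the pattern as literal text; re.IGNORECASE ignores case.
    regex = re.compile(re.escape(pattern), re.IGNORECASE)
    previous_lines = deque(maxlen=history)
    for line in lines:
        if regex.search(line):
            # Snapshot as a list so later appends don't change earlier results.
            yield line, list(previous_lines)
        previous_lines.append(line)

sample = ["I saw the cow\n", "he did jump\n", "over the moon\n", "JUMP\n"]
matches = [line for line, _ in search_regex(sample, "JUMP", 5)]
print(matches)
```

Both "he did jump" and "JUMP" match here, because re.IGNORECASE applies to the whole pattern.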

Definitely the simplest and easiest thing is André’s suggestion -
lowercase both strings, then search.

Only reach for regular expressions when things get quite difficult
without them. For anything complex they’re cryptic and error prone. Not
to mention expensive - they’re inherently more complex than simple fixed
string stuff. Of course, there is a threshold where they’re a sensible
choice, but too many people reach for them as their first choice.
Generally they are a middle ground: undesirable for simple stuff,
suitable for some more complicated stuff, and undesirable again for
serious parsing (eg language grammars).

Cheers,
Cameron Simpson cs@cskk.id.au

Hi André,

The casefold method of strings is better for case-insensitive testing.
It is similar to lowercase, but covers some unusual cases such as German
sharp s ß better.

Even there, things like Turkish dotted i and dotless ı won’t work
correctly, so if you are dealing with Turkic languages, you may need
some custom functions.
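
To make the difference concrete, here is a rough sketch. The function name search_casefold is just for illustration, and it yields a list copy of the previous lines instead of the deque itself. The two print calls compare lower() and casefold() on the sharp s case.

```python
from collections import deque

def search_casefold(lines, pattern, history=1):
    """Yield (line, previous_lines) for each line containing pattern, ignoring case."""
    folded = pattern.casefold()  # fold the pattern once, outside the loop
    previous_lines = deque(maxlen=history)
    for line in lines:
        if folded in line.casefold():
            # Snapshot as a list so later appends don't change earlier results.
            yield line, list(previous_lines)
        previous_lines.append(line)

# lower() leaves ß alone, so "strasse" is not found in "straße".
print("STRASSE".lower() in "straße".lower())        # False
# casefold() maps ß to "ss", so both sides become "strasse".
print("STRASSE".casefold() in "straße".casefold())  # True
```

For the Turkish case, casefold() still maps "I" to "i" rather than to dotless "ı", which is why it doesn't help there.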


That did it! Thanks guys. Best, Dave

> from collections import deque
> 
> def search(lines, pattern, history=1):
>     previous_lines = deque(maxlen=history)
>     for line in lines:
>         if pattern.lower() in line.lower():
>             yield line, previous_lines
>         previous_lines.append(line)
> 
> # Example use on a file
> if __name__ == '__main__':
>     with open('test.txt') as f:
>         for line, prevlines in search(f, 'JUMP', 5):
>             for pline in prevlines:
>                 print(pline, end='')
>             print(line, end='')
>             print('X'*20)