A little help with a simple intersection

hox · March 7, 2023, 10:38pm

I have:

with open('file1.txt', 'r') as file1:
    words1 = set(file1.read().split())

with open('file2.txt', 'r') as file2:
    words2 = set(file2.read().split())

common_words = words1.intersection(words2)
print(common_words)

Which works; however, I need some help with 1) finding the common words so that case doesn’t matter and 2) ordering the intersected words by the order in which they appear in file2.

file1 has an alphabetical list of words, file2 is several paragraphs of text.

TIA - hox

hox · March 7, 2023, 10:47pm

I got a little antsy and put in my request to chat.openai.com. It gave me the following which seems to work:

with open('file1.txt', 'r') as f1, open('file2.txt', 'r') as f2:
    # Read the contents of each file and split them into words
    words1 = set(f1.read().lower().split())
    words2 = f2.read().lower().split()

    # Find the common words between the two files
    common_words = [word for word in words2 if word in words1]

    # Output the common words in the order they are found in file2
    for word in common_words:
        print(word)

How does it look to the pros in here?
tia,
hox

cameron · March 7, 2023, 11:20pm

I’d open the files one at a time, unless you want to quit immediately if
you can’t open both. Personally, I’d consider that unlikely in normal
use so I’d just read each file on its own.

words1 looks fine. For a large file it is more memory efficient to
read it a line at a time instead of reading the whole file into memory
with f1.read(), e.g.:

 words1 = set()
 with open('file1.txt', 'r') as f1:
     for line in f1:
         words1.update(line.lower().split())

You can do the same to make words2.
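
For file2 the order matters, so a list is probably the better container than a set. A minimal sketch of that, where read_words is a hypothetical helper name (it is not from the code above) and the file is assumed to be plain text:

```python
def read_words(path):
    """Return the lowercased words of a text file, in file order.

    read_words is a hypothetical helper name used for illustration.
    """
    words = []
    with open(path, 'r') as f:
        # Read one line at a time so the whole file is never held as one string
        for line in f:
            words.extend(line.lower().split())
    return words
```

Something like words2 = read_words('file2.txt') would then give the words in the order they were found. Note that split() only splits on whitespace, so in ordinary paragraphs a word followed by punctuation (say "apple,") will not match "apple" unless the punctuation is stripped first.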

To recite the intersecting words in file2 order, you’ve got 2
approaches:

- load all the file2 words, then intersect with words1 (don’t forget
  that sets have an intersection operation), then scan the words from
  file2 (using a separate list of those words you also kept) and print
  or skip each one depending on whether it’s in the intersection set
- read file2 progressively, iterating over the words; if a word exists
  in words1, print it and (maybe) discard it from words1 if you do
  not want to repeat a word; this obviates any need for a words2 set
  etc
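
A rough sketch of both approaches, as one possible reading of the above. The function names are made up for illustration, and the sample data stands in for the real files:

```python
def common_in_order_two_pass(words1, file2_words):
    """Approach 1: intersect as sets, then scan file2's words in order."""
    common = words1 & set(file2_words)
    # Repeated words in file2 are reported every time they occur
    return [w for w in file2_words if w in common]


def common_in_order_streaming(words1, lines):
    """Approach 2: walk file2 line by line, reporting each match once."""
    remaining = set(words1)  # work on a copy so the caller's set is untouched
    found = []
    for line in lines:
        for word in line.lower().split():
            if word in remaining:
                found.append(word)
                remaining.discard(word)  # skip this line to allow repeats
    return found


# Sample data (an assumption, standing in for file1.txt and file2.txt)
words1 = {"apple", "banana", "cherry"}
lines = ["Cherry pie and apple tart\n", "More APPLE and banana\n"]
file2_words = [w for line in lines for w in line.lower().split()]

print(common_in_order_two_pass(words1, file2_words))
# ['cherry', 'apple', 'apple', 'banana']
print(common_in_order_streaming(words1, lines))
# ['cherry', 'apple', 'banana']
```

With a real file, an open file object can be passed as lines, since iterating over it yields one line at a time.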

Cheers,
Cameron Simpson cs@cskk.id.au
