Compare two lists

I have two lists and need to compare them to print which last names and first names are missing from the second list. Note that the lastname entries, and the comma-separated first names within each entry, can appear in any order. The output should look something like this:

MISSING ENTRIES
---------------------
lastname: Joseph missing: Matthew
lastname: Richard missing: Charles
lastname: Michael missing completely

The code so far is:

import pprint

data1 = [
    {"lastname": "Joseph", "firstnames": "Thomas, Christopher, Matthew, Anthony"},
    {"lastname": "Richard", "firstnames": "Robert, William, Charles"},
    {"lastname": "Michael", "firstnames": "James, Daniel"},
]

data2 = [
    {"lastname": "Richard", "firstnames": "William, Robert"},
    {"lastname": "Joseph", "firstnames": "Christopher, Anthony, Thomas"},
]

print("###############################################################")
pprint.pprint(data1)
print("###############################################################")
pprint.pprint(data2)
print("###############################################################")

for row in data1:
    for row2 in data2:
        if row["lastname"] not in row2["lastname"]:
            print("lastname: " + row2["lastname"] + " missing: " + row["firstnames"])

To find names that exist in data1 but not in data2, it’s convenient to use sets. First, build a dict for each list that maps every lastname to the set of its first names:

names1 = {n["lastname"]: set(n["firstnames"].split(", ")) for n in data1}
names2 = {n["lastname"]: set(n["firstnames"].split(", ")) for n in data2}
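Note that split(", ") assumes every name is separated by exactly one comma followed by one space. If the spacing might vary, a variant along these lines could be used instead. This is only a sketch: parse_firstnames and index_by_lastname are hypothetical helper names, not part of the original answer.

```python
def parse_firstnames(firstnames):
    """Split a comma-separated string into a set of names, ignoring stray spaces."""
    # Split on the bare comma, strip each piece, and drop empty pieces.
    return {name.strip() for name in firstnames.split(",") if name.strip()}


def index_by_lastname(rows):
    """Map each lastname to the set of its first names."""
    return {row["lastname"]: parse_firstnames(row["firstnames"]) for row in rows}


# Illustrative input with inconsistent spacing (made up for this sketch).
sample = [{"lastname": "Joseph", "firstnames": "Thomas,Christopher ,  Matthew"}]
names = index_by_lastname(sample)
```

With this, "Thomas,Christopher" and "Thomas, Christopher" produce the same set, so spacing differences are not reported as missing names.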

Then, use Python’s built-in set operators to calculate the difference between them:

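for n in names1:
    if n in names2:
        # Set difference: first names in data1 that data2 lacks (:= needs Python 3.8+)
        if missing := names1[n] - names2[n]:
            print(f"lastname: {n} missing: {', '.join(sorted(missing))}")
        continue
    # The lastname does not appear in data2 at all
    print(f"lastname: {n} missing completely")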

Output:

lastname: Joseph missing: Matthew
lastname: Richard missing: Charles
lastname: Michael missing completely
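To reproduce the full layout from the question, including the MISSING ENTRIES header, the same logic could be wrapped in two small functions. This is a sketch under assumptions: find_missing and report are hypothetical names, None is used here to mark a lastname that is absent from the second list, and first names are sorted so the output is deterministic (sets are unordered).

```python
def find_missing(data1, data2):
    """Return {lastname: missing first names}, or None if the lastname is absent from data2."""
    names1 = {n["lastname"]: set(n["firstnames"].split(", ")) for n in data1}
    names2 = {n["lastname"]: set(n["firstnames"].split(", ")) for n in data2}
    missing = {}
    for lastname, firstnames in names1.items():
        if lastname not in names2:
            missing[lastname] = None  # whole entry is missing
        elif firstnames - names2[lastname]:
            missing[lastname] = firstnames - names2[lastname]
    return missing


def report(missing):
    """Format the result of find_missing as the text layout from the question."""
    lines = ["MISSING ENTRIES", "---------------------"]
    for lastname, firstnames in missing.items():
        if firstnames is None:
            lines.append(f"lastname: {lastname} missing completely")
        else:
            lines.append(f"lastname: {lastname} missing: {', '.join(sorted(firstnames))}")
    return "\n".join(lines)


data1 = [
    {"lastname": "Joseph", "firstnames": "Thomas, Christopher, Matthew, Anthony"},
    {"lastname": "Richard", "firstnames": "Robert, William, Charles"},
    {"lastname": "Michael", "firstnames": "James, Daniel"},
]
data2 = [
    {"lastname": "Richard", "firstnames": "William, Robert"},
    {"lastname": "Joseph", "firstnames": "Christopher, Anthony, Thomas"},
]

print(report(find_missing(data1, data2)))
```

Returning the result as a dict before formatting it keeps the comparison separate from the printing, which makes it easier to reuse or check.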

Thank you so much! Merry Christmas and happy New Year!
