Confusion on a dictionary concept question

Hi, I’m confused about how the dictionary works in this code. I have an if-statement nested inside my for loop. Where the if-statement tests the condition “reviews_max[name] < n_reviews”, how does the dictionary (‘reviews_max’) automatically know to use the ‘Reviews’ column (index 3) of the list of lists, ‘dataset’, for this comparison with the ‘n_reviews’ variable? Why doesn’t Python use index 2 (‘Rating’) instead? Is it implied that when you set n_reviews = float(row[3]), the dictionary will automatically use index 3 as the value for ‘reviews_max[name]’ in this comparison? Here is the full code:

[code]

# Example dataset (list of lists)

dataset = [
    ["App Name", "Category", "Rating", "Reviews"],
    ["Facebook", "Social", 4.5, 78158306],
    ["Instagram", "Social", 4.7, 66577313],
    ["WhatsApp", "Communication", 4.4, 119400128]
]

# Initialize an empty dictionary to store maximum reviews for each app
reviews_max = {}

# Iterate over the dataset (excluding the header row)
for row in dataset[1:]:
    # Extract app name (from first column) and number of reviews (from fourth column)
    name = row[0]  # Assuming app name is in the first column
    n_reviews = float(row[3])  # Assuming number of reviews is in the fourth column
    
    # Check if the app already exists in reviews_max and update if necessary
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    # If the app is not in reviews_max, add it with its number of reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews

# Print the resulting dictionary
print(reviews_max)

[/code]

The dictionary doesn’t “know” anything, and doesn’t “use” anything.

Each time through the for row in dataset[1:]: loop, row means one of the rows of the dataset. Then we set n_reviews = float(row[3]), so n_reviews means the last value from that row (which is 78158306 the first time, 66577313 the second time and 119400128 the third time), converted to a float. Similarly, name comes from the first value of the row, which is "Facebook" the first time, "Instagram" the second time, and "WhatsApp" the third time.

Then, the code tries to look for a value stored in reviews_max with the name key. For the example data, we have a different name each time, so it will never find an already-stored value, and just store the initial value each time. But if we had another row with the same name, the code would find the existing value stored in reviews_max under that name, and compare it to the n_reviews value for this time through the loop.

In short: the dictionary only ever stores review values, therefore only review values are ever retrieved from it (and compared with other review values).
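To see the comparison actually fire, here is a small sketch with made-up extra rows in which "Facebook" appears more than once. The duplicate rows and their review counts are invented for illustration; the loop is the same one from the question.

```python
# Hypothetical data: "Facebook" appears three times with different review counts.
dataset = [
    ["App Name", "Category", "Rating", "Reviews"],
    ["Facebook", "Social", 4.5, 78158306],
    ["Instagram", "Social", 4.7, 66577313],
    ["Facebook", "Social", 4.5, 78128208],   # smaller than the stored value
    ["Facebook", "Social", 4.5, 78200000],   # larger than the stored value
]

reviews_max = {}
for row in dataset[1:]:
    name = row[0]
    n_reviews = float(row[3])
    # reviews_max[name] is whatever float was stored earlier under this name.
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews

print(reviews_max)
# {'Facebook': 78200000.0, 'Instagram': 66577313.0}
```

The second "Facebook" row is ignored because the stored value is already larger, and the third one replaces it. In both cases the comparison is between two floats: the one already in the dictionary and the one just read from the current row.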

This is the part where I am confused, where you say, "Then, the code tries to look for a value stored in reviews_max with the name key."

At this juncture, does it mean that the code sees the “key” in the dictionary and automatically retrieves the value associated with it, in this case at index 3 (the ‘Reviews’ column) in ‘dataset’? Is this retrieval of the value from index 3 of ‘dataset’ due to the earlier statement in the code: ‘n_reviews = float(row[3])’?

But what happens if I said:

n_reviews = row[2]

instead of n_reviews = float(row[3])

Would it then retrieve the values from index 2 to do the comparison?

reviews_max has nothing to do with dataset. They are completely separate.

What you’re doing is getting values from dataset and using them to change what’s in reviews_max.
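One way to picture that separation is a rough sketch where the column index is passed in explicitly. The function max_per_name and its column parameter are hypothetical names made up for this example; they are not part of the original code.

```python
dataset = [
    ["App Name", "Category", "Rating", "Reviews"],
    ["Facebook", "Social", 4.5, 78158306],
    ["Instagram", "Social", 4.7, 66577313],
    ["WhatsApp", "Communication", 4.4, 119400128],
]

def max_per_name(rows, column):
    """Return {name: largest value seen in the given column} (hypothetical helper)."""
    result = {}
    for row in rows:
        name = row[0]
        value = float(row[column])   # the only place the column index matters
        if name not in result or result[name] < value:
            result[name] = value
    return result

print(max_per_name(dataset[1:], 3))
# {'Facebook': 78158306.0, 'Instagram': 66577313.0, 'WhatsApp': 119400128.0}
print(max_per_name(dataset[1:], 2))
# {'Facebook': 4.5, 'Instagram': 4.7, 'WhatsApp': 4.4}
```

The dictionary code is identical in both calls. It ends up holding reviews or ratings only because of which value was read from the row before being stored.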

Yes.

No. At this point in the code, dataset is completely and utterly irrelevant.

The value stored in the dictionary is the one that was stored there when the reviews_max[name] = n_reviews code ran. The dictionary stores the value, not a name - it doesn’t care about the n_reviews variable, or about where that value came from; it cares about the actual object (the float that was created by doing float(row[3])).

The “retrieval” has already happened, either way.

It is the same as how if you do:

a = 1
b = 2
c = a + b

now c will get the value 3 assigned to it - a computed result, that doesn’t care about the a or b variables. If anything happens to a or b after that, it has no effect on c.

Even if we pull something out of another data structure:

a = [1, 2, 3]
b = []
b.append(a[0])

Now the b list contains the number 1, because that’s what a[0] found. If we then put something else into a[0], it doesn’t affect b. To affect b, we’d have to somehow change the object for the integer 1 itself (not possible from within Python).
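The same thing can be tried with the dictionary from the question. This is only a sketch using a single row; the later changes to dataset and n_reviews are made up to show that they have no effect on what was already stored.

```python
dataset = [
    ["App Name", "Category", "Rating", "Reviews"],
    ["Facebook", "Social", 4.5, 78158306],
]

row = dataset[1]
name = row[0]
n_reviews = float(row[3])

reviews_max = {}
reviews_max[name] = n_reviews   # stores the float object itself

# Afterwards, change the source data and rebind the variable.
dataset[1][3] = 0
n_reviews = -1.0

print(reviews_max[name])
# 78158306.0
```

The dictionary still holds 78158306.0. It never kept a link back to row[3] or to the n_reviews variable, only the float that existed at the moment of the assignment.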

You can print out the reviews_max variable each time around the loop to see what is in the dictionary as the code runs, not just at the end.

Printing variables like this is often a good way to understand how the code is working. The output can be used to confirm that the code is, or is not, working as expected.
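For example, a minimal version of the loop from the question with one extra print call at the end of each pass might look like this (the wording of the printed message is just a suggestion):

```python
dataset = [
    ["App Name", "Category", "Rating", "Reviews"],
    ["Facebook", "Social", 4.5, 78158306],
    ["Instagram", "Social", 4.7, 66577313],
    ["WhatsApp", "Communication", 4.4, 119400128],
]

reviews_max = {}
for row in dataset[1:]:
    name = row[0]
    n_reviews = float(row[3])
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews
    # Show the current row's values and the dictionary after this pass.
    print("name:", name, "n_reviews:", n_reviews, "reviews_max:", reviews_max)
```

Each line of output shows one more key in the dictionary than the line before, which makes it easy to see that only the name and the review count from the current row are ever put into it.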

The question is invalid because the reviews_max dict does not use anything in the list of lists directly; it only uses name and n_reviews, which were extracted from one particular row. Reread carefully.