# Which data structure will be better to solve this question?

You just entered a party in which there are đť‘› people, It is known that there is exactly one celebrity in the
party and you are determined to know him. You are permitted to ask the question â€śDo you know him?â€ť.
Write an algorithm to find the celebrity in the minimum number of questions. We define a celebrity as
the one who is known to everyone in the party and who doesnâ€™t know anyone.

I was recently asked this question in my CS exam, I wanted to know if it is possible to solve this question using `array` data structure. If yes, is it not easier to solve this problem using a `graph` ?

An approach to solve this question will be really appreciated.

Wellâ€¦

You could solve it with either an array or a graph, depending how you
make them.

Think about the criteria for â€ścelebrityâ€ť. What is it expressed in? It
looks like a pair of counts: how many people does this person know? How
many people know this person?

question, what you learn can be expressed by adjusting the counters for

How many counters are there? How might you store those counters so that
you can find them again to adjust them or query them? What data
structure might store those counters?

Cheers,
Cameron Simpson cs@cskk.id.au

Yes, you could solve this with the array module:

https://docs.python.org/3/library/array.html

but nested lists would probably be better. Another solution would be to

use graphs. Here is a very old essay on how to work with graphs in

Python:

https://legacy.python.org/doc/essays/graphs/

But I think that the minimum number of questions needed is to ask each

person in turn â€śDo you know anyone in this party?â€ť and stop as soon as

someone answers â€śNoâ€ť. That person is the celebrity.

*We define a celebrity as the one who is known to everyone in the party

and who doesnâ€™t know anyone.*

There is exactly one such celebrity. So as soon as you find the person

who knows nobody, they must be the celebrity.

There canâ€™t be two people who answer â€śNoâ€ť to the question, because that

would mean that there is at least one person who doesnâ€™t know the

celebrity, which is not allowed from the first condition that everyone

knows the celebrity.

So an easy way to solve this doesnâ€™t require graphs, it just needs a

dict, strings and lists. Represent each person as a string (their name).

Create a dict with the keys being each person, and the values being a

list (or a set) of who they know:

``````d = {'Bob': ['Michelle', 'Homer', 'Johnny'],

'Michelle': ['Marge', 'Johnny'],

'Homer': ['Marge', 'Bob', 'Samuel', 'Johnny'],

'Johnny': [],

...

}
``````

So Johnny is the celebrity.

``````for person, people_they_know in d.items():

if len(people_they_know) == 0:

print("The celebrity is", person)

break
``````

Assuming the data is correct, you donâ€™t need anything more than finding

the person who knows nobody, but if you want to verify that the data is

consistent with the problem statement, you can then confirm that

everybody knows Johnny using a similar loop. But if the problem isnâ€™t

lying to you, then this is unnecessary.

Another way is to use nested lists to make a communication matrix where

0 represents â€śdoesnâ€™t knowâ€ť and 1 represents â€śknowsâ€ť. As a special

case, we put 0 in the diagonal entries: for the purposes of the

problem, nobody knows themselves.

(This isnâ€™t strictly necessary, but it makes the problem a bit easier

to solve.)

``````#     Bob Michelle Homer Johnny Marge Samuel ...

d = [[0,  1,       1,    1,     0,    0,     ...  ] # Bob

[0,  0,       0,    1,     1,    0,     ...  ] # Michelle

[0,  0,       0,    1,     1,    1,     ...  ] # Homer

[0,  0,       0,    0,     0,    0,     ...  ] # Johnny

...

]
``````

Now you look for the row that is all zeroes, or the column that is all

ones, except for the diagonal entry which will be a zero.

You can do this with the array module, but nested lists are easier.

1 Like