Which data structure will be better to solve this question?

You just entered a party in which there are 𝑛 people, It is known that there is exactly one celebrity in the
party and you are determined to know him. You are permitted to ask the question “Do you know him?”.
Write an algorithm to find the celebrity in the minimum number of questions. We define a celebrity as
the one who is known to everyone in the party and who doesn’t know anyone.

I was recently asked this question in my CS exam, I wanted to know if it is possible to solve this question using array data structure. If yes, is it not easier to solve this problem using a graph ?

An approach to solve this question will be really appreciated.


You could solve it with either an array or a graph, depending how you
make them.

Think about the criteria for “celebrity”. What is it expressed in? It
looks like a pair of counts: how many people does this person know? How
many people know this person?

Initially you know nothing about anyone. Every time you ask the
question, what you learn can be expressed by adjusting the counters for
the person you’re asking, and the person you’re asking about.

How many counters are there? How might you store those counters so that
you can find them again to adjust them or query them? What data
structure might store those counters?

Cameron Simpson cs@cskk.id.au

Yes, you could solve this with the array module:


but nested lists would probably be better. Another solution would be to

use graphs. Here is a very old essay on how to work with graphs in



But I think that the minimum number of questions needed is to ask each

person in turn “Do you know anyone in this party?” and stop as soon as

someone answers “No”. That person is the celebrity.

*We define a celebrity as the one who is known to everyone in the party

and who doesn’t know anyone.*

There is exactly one such celebrity. So as soon as you find the person

who knows nobody, they must be the celebrity.

There can’t be two people who answer “No” to the question, because that

would mean that there is at least one person who doesn’t know the

celebrity, which is not allowed from the first condition that everyone

knows the celebrity.

So an easy way to solve this doesn’t require graphs, it just needs a

dict, strings and lists. Represent each person as a string (their name).

Create a dict with the keys being each person, and the values being a

list (or a set) of who they know:

d = {'Bob': ['Michelle', 'Homer', 'Johnny'],

     'Michelle': ['Marge', 'Johnny'],

     'Homer': ['Marge', 'Bob', 'Samuel', 'Johnny'],

     'Johnny': [],



So Johnny is the celebrity.

for person, people_they_know in d.items():

    if len(people_they_know) == 0:

        print("The celebrity is", person)


Assuming the data is correct, you don’t need anything more than finding

the person who knows nobody, but if you want to verify that the data is

consistent with the problem statement, you can then confirm that

everybody knows Johnny using a similar loop. But if the problem isn’t

lying to you, then this is unnecessary.

Another way is to use nested lists to make a communication matrix where

0 represents “doesn’t know” and 1 represents “knows”. As a special

case, we put 0 in the diagonal entries: for the purposes of the

problem, nobody knows themselves.

(This isn’t strictly necessary, but it makes the problem a bit easier

to solve.)

#     Bob Michelle Homer Johnny Marge Samuel ...

d = [[0,  1,       1,    1,     0,    0,     ...  ] # Bob

     [0,  0,       0,    1,     1,    0,     ...  ] # Michelle

     [0,  0,       0,    1,     1,    1,     ...  ] # Homer

     [0,  0,       0,    0,     0,    0,     ...  ] # Johnny



Now you look for the row that is all zeroes, or the column that is all

ones, except for the diagonal entry which will be a zero.

You can do this with the array module, but nested lists are easier.

1 Like