A bit late to the movie night, but while others have done well to answer the “what” of classes (and OO), what I haven’t seen so much here, and what I really struggled with for a long time as a beginner, is the “why”. I didn’t really understand it while I was learning programming, at least not while taking a basic Java class (pun intended) that focused a lot on OO; rather, I picked it up in practice after I started helping out on a larger open source project (Spyder).
(Sidenote—I’ve found answering user questions here, as you are doing here, to be a great way to broaden and sharpen my knowledge. Conversely, I’ve found contributing to a decent-sized FOSS project to be a great way to deepen it).
To summarize, two really important and powerful features of classes (arguably, the primary motivating reasons they exist) are inheritance and composability.
Inheritance
Inheritance means that classes exist in a hierarchy; a child class can inherit a parent class’s attributes and methods, and either implement them differently or add some of its own. In your case, let’s tweak your Movie class to add an abstract method play(), which would play the movie:
class Movie:
    def __init__(self, title, director, year):
        self.title = title
        self.director = director
        self.year = year

    def play(self):
        # In reality you'd probably do this with `@abc.abstractmethod`; keeping it simple here
        raise NotImplementedError
The play method is abstract because it doesn’t have a concrete implementation in Movie. So, what good is it? Well, let’s say you implement a child class NetflixMovie, and add the necessary logic to find and play the movie from Netflix:
class NetflixMovie(Movie):
    def play(self):
        # `netflix` stands in for whatever client library you'd actually use
        results = netflix.search(title=self.title, year=self.year)
        netflix.stream(results[0])
And suppose you also have a child class DisneyplusMovie that implements the same for a movie on Disney+:
class DisneyplusMovie(Movie):
    def play(self):
        # `disneyplus` is likewise a stand-in, with its own (different) API
        results = disneyplus.find(film_title=self.title, release_year=self.year)
        disneyplus.start_stream(results[0])
Now, if you have a dict movies as above, and each movie in it is an instance of some concrete subclass of Movie (e.g. DisneyplusMovie or NetflixMovie), calling play() on it will play the movie from the streaming service it’s on (otherwise, if it is just a plain Movie, it will raise a NotImplementedError):
movies['Scarface'].play()
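To try the dispatch out without any real streaming API, here is a minimal runnable sketch. It assumes stand-in play() methods that just return a string, since netflix and disneyplus above are placeholders rather than real libraries; which service each sample title lives on is made up for illustration.

```python
class Movie:
    def __init__(self, title, director, year):
        self.title = title
        self.director = director
        self.year = year

    def play(self):
        raise NotImplementedError


class NetflixMovie(Movie):
    def play(self):
        # Stand-in for the real search-and-stream logic
        return f"Streaming {self.title} ({self.year}) from Netflix"


class DisneyplusMovie(Movie):
    def play(self):
        # Same method name, different behaviour
        return f"Streaming {self.title} ({self.year}) from Disney+"


# Illustrative data only; the service assignments are hypothetical
movies = {
    "Scarface": NetflixMovie("Scarface", "Brian De Palma", 1983),
    "Toy Story": DisneyplusMovie("Toy Story", "John Lasseter", 1995),
}

# The caller never checks which service a movie belongs to
for movie in movies.values():
    print(movie.play())
```

The loop at the end is the whole point: it calls play() and lets each object decide what that means.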
You might (rightly) ask: why do this, instead of just writing a function like play_netflix_movie and calling it if movie is a Netflix movie? For example, instead of the subclasses, what if you did this (with Movie having an extra type attribute for the streaming service it’s on):
def play_netflix_movie(movie):
    # Free function: there is no `self` here, so use the `movie` argument
    results = netflix.search(title=movie.title, year=movie.year)
    netflix.stream(results[0])

def play_disneyplus_movie(movie):
    results = disneyplus.find(film_title=movie.title, release_year=movie.year)
    disneyplus.start_stream(results[0])
which could then be called as
if movie.type == "netflix":
    play_netflix_movie(movie)
elif movie.type == "disneyplus":
    play_disneyplus_movie(movie)
else:
    raise Exception("Cannot play this type of movie.")
Two reasons:
- First, on the caller side (the place play() is actually called), you don’t need to worry about manually dispatching to the right function for the streaming service the movie is on (Netflix, Disney+, Amazon, etc.); that’s all done for you based on what class movie is. Furthermore, all that logic is contained in the movie subclass, rather than in separate free functions floating around, which helps keep your code more logical, better organized, easier to refactor and less bug-prone. This is, in effect, separation of concerns.
- Second, to your point about future flexibility, if you want to add a new type of movie, say AmazonMovie, you simply add the implementation code in that subclass, and the calling and dispatch code never needs to change. It will handle the new subclass seamlessly, because it works for all Movie subclasses (that implement play()); it doesn’t have to care about any of the internal implementation details.
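Both points can be sketched together using the @abc.abstractmethod approach mentioned in the code comment earlier. This is only an illustration under assumptions: the play() bodies return strings instead of calling a real service, and play_all is a hypothetical caller invented for the example.

```python
import abc


class Movie(abc.ABC):
    def __init__(self, title, director, year):
        self.title = title
        self.director = director
        self.year = year

    @abc.abstractmethod
    def play(self):
        """Play the movie on whatever service hosts it."""


class NetflixMovie(Movie):
    def play(self):
        return f"Netflix: {self.title}"


# Added later: nothing else in the program has to change
class AmazonMovie(Movie):
    def play(self):
        return f"Amazon: {self.title}"


def play_all(movies):
    # Hypothetical caller: no per-service branching anywhere
    return [movie.play() for movie in movies]


print(play_all([
    NetflixMovie("Scarface", "Brian De Palma", 1983),
    AmazonMovie("Scarface", "Brian De Palma", 1983),
]))
```

A side benefit of abc here is that trying to instantiate Movie directly raises a TypeError up front, rather than failing later when play() is called.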
Composition
On the second major point, composition, I’ll be more brief, but it can potentially be even more powerful. With this, you can compose multiple objects inside one another to implement more complex functionality. For example, if you had a director class:
class Director:
    def __init__(self, name, birth_year, country):
        self.name = name
        self.birth_year = birth_year
        self.country = country
and you had some dictionary directors of Director instances keyed by name, you could modify Movie like so:
class Movie:
    def __init__(self, title, director, year):
        self.title = title
        # Look up the Director instance by name
        self.director = directors[director]
        self.year = year
Now, you can get the director’s country like so:
movies['Scarface'].director.country
This would be much more complex and tedious to implement as a series of functions.
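Put together as something you can run, the composition idea might look like the sketch below. It keeps the name-based lookup from above; the contents of the directors and movies dicts are sample data for illustration.

```python
class Director:
    def __init__(self, name, birth_year, country):
        self.name = name
        self.birth_year = birth_year
        self.country = country


# Director instances keyed by name
directors = {
    "Brian De Palma": Director("Brian De Palma", 1940, "USA"),
}


class Movie:
    def __init__(self, title, director, year):
        self.title = title
        # Store the whole Director object, not just the name
        self.director = directors[director]
        self.year = year


movies = {"Scarface": Movie("Scarface", "Brian De Palma", 1983)}

# Attribute access chains through the composed objects
print(movies["Scarface"].director.country)  # USA
```

An alternative design is to pass the Director instance straight into Movie instead of looking it up in a module-level dict, which avoids the hidden dependency on directors.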
Other advantages
One other benefit of a class is that you get all the attributes of the thing you’re working with neatly bundled up into a single object to pass around to functions, even if you don’t actually define those functions as methods. For example, instead of passing around title, director and year to search_movie:
def search_movie(title, director, year):
    netflix.do_lookup_by_title(title=title)
    ...
you just pass the instance of Movie instead, and search_movie can access what it needs:
def search_movie(movie):
    netflix.do_lookup_by_title(title=movie.title)
    ...
Also, it means you can check that title, director and year are valid and of the right type all in one place (the Movie constructor, __init__) rather than having to check validity in each function or risking something unexpected going wrong.
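As a rough sketch of what that centralized checking could look like (the specific rules and error messages here are assumptions chosen for illustration, not requirements):

```python
class Movie:
    def __init__(self, title, director, year):
        # Validate once here, so every function that receives
        # a Movie can trust its attributes
        if not isinstance(title, str) or not title:
            raise ValueError("title must be a non-empty string")
        if not isinstance(director, str) or not director:
            raise ValueError("director must be a non-empty string")
        if not isinstance(year, int):
            raise TypeError("year must be an int")
        self.title = title
        self.director = director
        self.year = year


movie = Movie("Scarface", "Brian De Palma", 1983)  # passes all checks

try:
    Movie("Scarface", "Brian De Palma", "1983")  # year given as a string
except TypeError as error:
    print(f"Rejected: {error}")
```

Any function that takes a Movie can then skip its own checks, because an invalid one could never have been constructed in the first place.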
Also, versus just storing these as keys in a dictionary:
movie_dict = {"title": "Scarface", "director": "Brian De Palma", "year": 1983}
you can statically guarantee that title, director and year exist and are the expected types (which using a dataclass, as below, makes especially easy). This becomes especially powerful if you use a static type checker like Mypy, or even if you are just checking your code manually, as it can help you spot all sorts of bugs that you never could with an arbitrary dictionary.
Dataclasses
On a final note, for classes that purely represent some data (attributes) without callable functionality (methods), consider a dataclass instead:
from dataclasses import dataclass

@dataclass
class Movie:
    # Fields are declared as plain annotated names, without `self.`
    title: str
    director: str
    year: int
This is much simpler and cleaner to write, with less overhead (at least source code wise) than a regular class, and it gives you a bunch of useful dunder methods (__init__, __eq__, __repr__, etc.) defined “for free”, while also allowing you to write your own if desired.
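To see what those generated methods buy you, here is a small sketch; the two instances are just sample data.

```python
from dataclasses import dataclass


@dataclass
class Movie:
    title: str
    director: str
    year: int


# __init__ is generated from the annotated fields
a = Movie("Scarface", "Brian De Palma", 1983)
b = Movie(title="Scarface", director="Brian De Palma", year=1983)

# __eq__ compares field by field, not by object identity
print(a == b)

# __repr__ gives a readable description, which print() falls back on
print(a)
```

With a hand-written class, a == b would be False unless you wrote __eq__ yourself, and printing would show only the default object representation.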