How to Effectively Use Python Classes
Posted on May 18, 2020 in Tips & Tricks
Let’s explore the powerful object-oriented programming concepts available to us in Python!
This article is part of the course
“Object-Oriented Python — for Beginners”. If you’re looking for a more beginner-oriented “Introduction to Python”
course, I created one just for you! This course takes things a step further and delves deep into
object-oriented Python. Earlier we saw how the tuple data type is used to create a record. The class can do the same, but it also allows us to bundle functions alongside the data.
Python is an object-based programming language. Every value, whatever its data type, is an object, and every variable refers to one. But Python does not require us to use the object-oriented programming paradigm; we’ve done just fine so far without writing any classes.
Writing Classes in Python
Object-oriented programming has its own vocabulary which you’ll need to become familiar with:
- Class: a class is the computer code which defines the blueprint for objects. A class is a piece of code, often quite complex, written by a programmer like you.
- Object: an object is something that you can manipulate with your program. It’s usually stored within a variable. Programmers create objects based on the blueprints defined in classes.
Let’s define a class called Book.
- On line 5, we begin the Book class blueprint.
- On line 10, __init__ is the initializer, which allows the instance variables of a new object to be initialized using the arguments given.
- On lines 14–18 we initialize some instance variables. Some receive the values given as arguments during creation. Others are initialized to default values.
- We also define two instance methods, set_note() and goto_page().
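The bullets above describe a listing by its line numbers. As a rough stand-in, here is a minimal sketch of what such a Book class could look like. The method bodies, the default values and the return values are assumptions made for illustration, and the line numbers will not match the ones quoted above.

```python
class Book:
    """A book record that bundles data with the functions acting on it."""

    def __init__(self, isbn: str, title: str, author: str):
        # Instance variables that receive the values passed in at creation.
        self.isbn = isbn
        self.title = title
        self.author = author
        # Instance variables that start from a default value (assumed here).
        self.page = 0
        self.notes = []

    def set_note(self, note: str, page: int) -> int:
        """Store a reader's note for a page and return the number of notes."""
        self.notes.append((note, page))
        return len(self.notes)

    def goto_page(self, page: int) -> int:
        """Move to the given page and return it."""
        self.page = page
        return self.page


mybook = Book(isbn="1510752242", title="Restoring Faith in ... Science",
              author="K Heckenlively")
print(mybook)  # default representation: the class name and a memory address
```

Because this sketch does not define its own string representation, print() falls back to the default one inherited from object.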
What’s going on here?
- First, on line 32, I create an instance of the class Book named mybook and pass some values to the initializer.
- Then I print() the string representation of the Book, and receive <class_01.Book object at 0x109dc8e20> as the output. Python does not know how to format this object as a string because we didn’t tell it how. So Python falls back to telling us that this is an object of class Book at a specific address in memory.
Representing the Object as a String
Let’s fix the unsightly problem that Python doesn’t know how to represent our Book object as a string. To accomplish this, we override a method that our class inherits.
Every user-defined object in Python inherits the capabilities of all Python objects. Let’s see what they are by running the dir() built-in function in our interactive Python terminal:
>>> dir(mybook)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'author', 'goto_page', 'isbn', 'notes', 'page', 'set_note', 'title']
The dir() built-in function returns a list of the names of an object’s attributes and methods. See all those double underscore method names? They’re inherited from class object, as they are in all Python classes. They’re called dunder methods, short for “double underscore methods”. We already saw one in our initializer, __init__.
We created a new __init__ initializer for our Book class, thereby overriding the inherited default. Let’s do the same for __repr__, which print() ends up calling (through the default __str__) to get a string representation of the object:
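A minimal sketch of such an override follows. The format string is modelled on the output quoted in this section, and the class is cut down to the three attributes it needs.

```python
class Book:
    def __init__(self, isbn: str, title: str, author: str):
        self.isbn = isbn
        self.title = title
        self.author = author

    def __repr__(self) -> str:
        # Replaces the __repr__ inherited from object.
        return (f"Object id: {id(self)}. Author: {self.author}. "
                f"Title: {self.title}. ISBN: {self.isbn}.")


mybook = Book(isbn="1510752242", title="Restoring Faith in ... Science",
              author="K Heckenlively")
print(mybook)  # print() now uses our __repr__
```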
When I execute this, I see the output:
Object id: 4543502176. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.
This replaces the previous cryptic output. My own __repr__() (pronounced: repper) function is being called, and not the one inherited from class object. My method knows how to properly represent this object as a string.
What is this id() built-in function? Let’s imagine I created a bunch of Book objects all with the same initial values. They’d look the same, even though they’re different. The id() function gives us an integer that is unique among all objects that exist at the same time; in CPython it is the object’s address in memory. Adding the id() output into the __repr__ function allows us to differentiate objects which otherwise would look the same.
import pprint

books: list = []  # an empty list that will hold Book objects
for i in range(10):
    books.append(Book(author="K Heckenlively", isbn="1510752242",
                      title="Restoring Faith in ... Science"))
pprint.pprint(books)
This gives us 10 different values for id() because all 10 Book objects exist at the same time in different parts of computer memory:
[Object id: 4408108896. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4407866416. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408366656. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408365264. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408365504. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408366752. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408365552. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408365696. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408365744. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.,
Object id: 4408367232. Author: K Heckenlively. Title: Restoring Faith in ... Science. ISBN: 1510752242.]
We have 10 Book objects, all containing the same information, but they are different objects to Python, and each consumes computer memory.
Instance Attributes
In the example above, in the class’s __init__() initializer, I defined the following instance attributes:
self.isbn
self.title
self.author
self.page
self.notes
The use of self. is a dead giveaway that the values in each attribute belong to a specific instance of an object. Setting an isbn on one instance of a Book object does not affect the isbn value in any other instance of a Book object.
Programmers most commonly use instance attributes in custom classes. Typically, they represent some aspect of a real-life thing.
Class Attributes
It is possible to define a class attribute which is shared amongst all instances of objects created from this class blueprint. This can be useful in a few cases.
One case is when you want to count all the objects of a class in existence. You can share a counter’s value amongst all instances by using a class attribute.
Another case is when you want to label every object of a class with a fixed version identifier. Let’s see this working.
What’s going on here?
- On line 4 we define the standard __version__ variable. See PEP-396.
- On line 14 we define a class attribute __class_version__ which is set to the value of the module’s variable.
- On line 13 we also define a class variable named counter, initialized to 0.
- On line 25 we access the object’s class using type(self) to increment the class variable counter by 1 when this object is created.
- On line 26 we update the instance attribute self.sequencenumber to record this object’s position in the order of creation.
- On line 28 I’ve introduced a new concept! @property is a decorator. This decorator indicates that the following instance method version() is a getter for the instance attribute version. Users of the Book class can use this getter to retrieve a property attribute of the object.
- On line 33 I’m cheating! Instead of returning the value of an instance attribute, I’m referencing the class of Book using type(self) to get the value of the class attribute named __class_version__.
- On line 48, remember that __repr__() is used by print() to output a string representation of the object.
- On line 49, {self.version} uses the object’s property getter, which returns the __class_version__ value.
- On line 50, {Book.counter} directly accesses the class attribute to return the number of objects ever created using the class blueprint.
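The pieces described above can be sketched as follows. This is an illustration rather than the original listing: the version string "1.0.0" and the exact wording of the __repr__ output are assumptions, and the line numbers will not match.

```python
__version__ = "1.0.0"  # module-level version string (illustrative value)


class Book:
    counter = 0                      # class attribute shared by every Book
    __class_version__ = __version__  # class attribute copied from the module

    def __init__(self, isbn: str, title: str, author: str):
        self.isbn = isbn
        self.title = title
        self.author = author
        type(self).counter += 1                   # update the shared counter
        self.sequencenumber = type(self).counter  # this object's own number

    @property
    def version(self) -> str:
        # Getter that returns the class-level value, not an instance value.
        return type(self).__class_version__

    def __repr__(self) -> str:
        return (f"Version: {self.version}. Books created: {Book.counter}. "
                f"Sequence number: {self.sequencenumber}. Title: {self.title}.")


first = Book(isbn="1510752242", title="Restoring Faith in ... Science",
             author="K Heckenlively")
second = Book(isbn="1510752242", title="Restoring Faith in ... Science",
              author="K Heckenlively")
print(first)
print(second)
```

Both objects report the same counter value because it lives on the class, while each keeps its own sequencenumber.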
Instance Properties and Encapsulation
So far, we’ve been directly accessing the object’s instance attributes. This is a bit like when you’re in a shop and you open up the box to take a look at a product and feel it. You’re not supposed to open the box, but very rarely does anyone stop you either. So far, we’ve been opening the box and eating the sweets in the store! There is a better way, using encapsulation or data-hiding. We accomplish this in Python by:
- Using the prefix _ on our instance attributes to indicate that they belong to the instance’s scope, and are sort of private.
- Using the @property and the @<name>.setter decorators:
What’s going on here? Let’s take the example of setting and getting the object’s property author.
- On line 22 author is still being initialized with the argument passed to __init__(), but now the value is being passed forward to the property setter…
- On line 63 you can see the decorator for the author property’s setter function: @author.setter, which is being called from line 22. Because the setter is an instance method, it takes self as the first argument. Using type hinting we hint that the value being passed should be of type str.
- On line 64, the object’s instance attribute _author is being set to the value received by the setter, but any extraneous whitespace before or after is being stripped: self._author = value.strip(). In this way, we can ensure that all values of author are being checked and verified by whatever rules we have put in place.
- On line 59 you can see the @property decorator on the author function. The function can now be read as if it were an instance attribute. Line 60 hints that the getter for the property named author will return a str data type.
- On line 61, my new and shiny “business rule” says that all values of author must do two things: firstly prepend the text “Ghost Author and” and then reverse the letters of the result. We achieve this by using f-strings and a reverse slice:
return f"Ghost Author and {self._author}"[::-1]
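Put together, the getter and setter described above might look like this minimal sketch. The class is reduced to the author property only, and the sample value with padding spaces is made up for the demonstration.

```python
class Book:
    def __init__(self, author: str):
        # This assignment goes through the @author.setter method below.
        self.author = author

    @property
    def author(self) -> str:
        # The "business rule": prepend some text, then reverse everything.
        return f"Ghost Author and {self._author}"[::-1]

    @author.setter
    def author(self, value: str) -> None:
        # Strip surrounding whitespace before storing the private value.
        self._author = value.strip()


book = Book(author="  K Heckenlively  ")
print(book._author)  # the stored value, whitespace stripped
print(book.author)   # the value as transformed by the getter
```

Reading book._author directly still works, which is why the underscore is only a convention: it tells users of the class that they should go through the property instead.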
Instance Methods
In our Book class, we already have an instance method:
def set_note(self, note: str, page: int) -> int:
There are two things which indicate that set_note is an instance method: firstly it is a method definition, so it has the def keyword. Secondly, the first argument the method takes is self. The Python interpreter itself will populate self with the current instance object. In this way, the code in the instance method can use self to access its own variables and data:
self.notes.append((note, page))
Instance methods are by far the most commonly seen in Python classes and you will likely use only these, most of the time. Instance methods provide functions which act upon the instance’s unique data and return results specific to this object.
The instance method set_note allows the user of the class to set a reader’s note. Something like “on page 185, I completely disagree with the author’s argument because my own experience shows the opposite!”. This note isn’t appropriate for any other Book object; it only makes sense for this book.
Because instance methods are the most common and most expected kind of method used in class blueprints, there is no special decorator to identify them.
Class Methods
In contrast to instance methods, class methods are less commonly needed. They act upon the class itself rather than upon one particular instance. Class methods are often used to define factory methods.
In our Book, we have a single initializer defined by __init__(). In Python, it’s not possible to have multiple initializers, as you can in Java or C#. What if we wanted to create empty books using the Book class? Our initializer requires us to specify the isbn, author and title. Let’s add a class method to our class for this situation.
@classmethod
def from_empty(cls):
    return cls(isbn=str(), author=str(), title=str())
What’s going on here?
- On line 91, we use the decorator @classmethod to indicate that the following method belongs to the class itself, not to one instance.
- On line 92, we don’t take self as an argument, we use cls which is supplied by the Python interpreter at run-time.
- On line 93, we call the initializer of cls, which at run-time is the class Book. We supply some empty values.
In this way, it is possible, in Python, for programmers to implement polymorphism. This allows an object to exhibit different behaviour in different circumstances.
Side comment: the __init__() method is often called the constructor but this isn’t strictly correct. To be absolutely precise, it is the initializer of the object. In the example above, from_empty() is a constructor; under the hood __init__() is called to initialize the object.
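To see why the factory calls cls(...) rather than Book(...), here is a small sketch. The AudioBook subclass is hypothetical and exists only to show that cls is whichever class the method was called on.

```python
class Book:
    def __init__(self, isbn: str, title: str, author: str):
        self.isbn = isbn
        self.title = title
        self.author = author

    @classmethod
    def from_empty(cls) -> "Book":
        # cls is the class the method was called on, not always Book.
        return cls(isbn="", title="", author="")


class AudioBook(Book):
    """Hypothetical subclass used only for this demonstration."""


empty_book = Book.from_empty()
empty_audio = AudioBook.from_empty()
print(type(empty_book).__name__)   # Book
print(type(empty_audio).__name__)  # AudioBook
```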
Static Methods
For the sake of completeness, it is important to mention static methods. Static methods are attached to a class blueprint, but they are totally independent of either an object instance or the class itself.
So why bother? If a static method has nothing at all to do with an instance object’s data, and in fact also has nothing to do with the class object, the only good reason to use one is that it just makes sense to keep things together.
Let’s consider ISBN numbers. Validation of ISBN numbers is done through the use of a checksum. For a 10-digit ISBN, the final character is in the range [0–9X]. The X is used if the checksum is 10.
Here is some Python code that can validate if a 10-digit ISBN number has a valid checksum:
It is useful, it clearly belongs to the concept of books, but where to put it? It’s handy to put it in the Book class, although it really operates on neither the class nor the object:
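A minimal sketch of such a validator as a static method follows. The method name is_valid_isbn10 and the decision to ignore hyphens are assumptions made for this example. The check itself is the standard ISBN-10 rule: weight the ten characters 10, 9, ..., 1 and require the total to be divisible by 11.

```python
class Book:
    @staticmethod
    def is_valid_isbn10(isbn: str) -> bool:
        """Return True if isbn is a 10-character ISBN with a valid checksum."""
        isbn = isbn.replace("-", "").upper()
        if len(isbn) != 10:
            return False
        total = 0
        for position, char in enumerate(isbn):
            if char == "X" and position == 9:
                value = 10  # X stands for 10, allowed only as the check digit
            elif char in "0123456789":
                value = int(char)
            else:
                return False
            total += (10 - position) * value  # weights run 10, 9, ..., 1
        return total % 11 == 0


print(Book.is_valid_isbn10("1510752242"))  # True
print(Book.is_valid_isbn10("1510752243"))  # False
```

Note that the method takes neither self nor cls, and it can be called on the class without creating a Book first.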
Class Hashability and Equality
The class concept is enormously powerful! But beware, it holds some pitfalls for the unwary. Let me list the three most common pitfalls:
- The print() function provides a fairly useless representation. We covered this earlier, and how to fix it by overriding the __repr__() function.
- When comparing two objects of the same class with the equality operator ==, two distinct objects will never be equal, even if they hold the same data, unless we override the dunder (double underscore) method __eq__() too.
- Two objects that compare equal must also have equal hash values, so we also need to override the dunder __hash__() method. Let’s do this for our Book class.
What’s going on here?
- From line 27, I’ve overridden the __eq__() method. Python3 is smart enough to negate this so I don’t need to also override the __ne__() method. The very first equality test is to ensure that the two objects being compared (self, other) are of the same data type. Beyond that, I’ve arbitrarily decided that the properties isbn, author and title are enough to uniquely identify a book. (In actuality, you’d need to additionally know the edition of the book, but hey, I’m not a librarian!)
- From line 33 I’ve implemented a __key() method. This returns a tuple of the object’s most important attributes. The __key() method also helps when sorting objects in things like a SortedList.
- On line 36, I’ve overridden the __hash__() function. My hash function returns a hash of the tuple returned by __key().
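The three overrides can be sketched like this. It is a cut-down illustration under the same assumption as above, namely that isbn, author and title identify a book. The line numbers will not match the bullets.

```python
class Book:
    def __init__(self, isbn: str, title: str, author: str):
        self.isbn = isbn
        self.title = title
        self.author = author

    def __key(self) -> tuple:
        # The attributes that decide whether two books are "the same".
        return (self.isbn, self.author, self.title)

    def __eq__(self, other) -> bool:
        if not isinstance(other, Book):
            return NotImplemented  # let Python try the other operand
        return self.__key() == other.__key()

    def __hash__(self) -> int:
        # Equal objects get equal hashes because both use the same key.
        return hash(self.__key())


a = Book(isbn="1510752242", title="Restoring Faith in ... Science",
         author="K Heckenlively")
b = Book(isbn="1510752242", title="Restoring Faith in ... Science",
         author="K Heckenlively")
print(a == b)       # True, although a and b are different objects
print(len({a, b}))  # 1, because the set treats them as duplicates
```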
Don’t worry if you notice that every time you execute the program you get new values for __hash__() even though you haven’t changed the values of isbn, author or title. This is expected.
By default, the __hash__() values of str, bytes and datetime objects are “salted” with an unpredictable random value. Although they remain constant within an individual Python process, they are not predictable between repeated invocations of Python. Python Documentation.
__slots__ For Improved Performance
Objects created from Python classes use quite a bit of your computer’s memory. Each one comes with a built-in dictionary (the __dict__ entry you may have noticed in the dir() output above).
It is easy to make your classes lightweight by defining __slots__ as a list of your instance attribute names. Under the hood, Python will use this predefined list instead of creating a dictionary of instance attributes. Objects built using this class will use less memory, meaning that you can have more of them at the same time.
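A small sketch of the idea follows. The class name SlimBook and the publisher attribute are invented for this example.

```python
class SlimBook:
    # Only these instance attributes can exist; no per-object __dict__ is made.
    __slots__ = ["isbn", "title", "author"]

    def __init__(self, isbn: str, title: str, author: str):
        self.isbn = isbn
        self.title = title
        self.author = author


slim = SlimBook(isbn="1510752242", title="Restoring Faith in ... Science",
                author="K Heckenlively")
print(hasattr(slim, "__dict__"))  # False

try:
    slim.publisher = "unknown"  # not listed in __slots__
except AttributeError as error:
    print(f"Cannot add attribute: {error}")
```

The trade-off is flexibility: you can no longer attach new attributes to an object on the fly.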
DataClass
There’s one last super-useful thing to know about Python3’s classes: the @dataclass decorator and how it makes your life easier. So far we’ve ploughed through a lot of boilerplate. This is code which isn’t really useful to anything except the Python interpreter. Enter the dataclass.
Under the hood, the dataclass generates all the things that we’ve just learned how to create by hand. Let’s rewrite the Book class as a dataclass.
You can see that the amount of boilerplate has been reduced dramatically. The dataclass toolkit has created all of the code needed to initialize a new object, compute equality and provide instance attributes behind the scenes. This is very handy, but it does take away some flexibility.
- On line 16, I’ve implemented the __post_init__(self) method. This is called automatically by a dataclass, right after the __init__() method.
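As a rough illustration of such a rewrite, here is a minimal dataclass version of Book. The choice of fields and the whitespace-stripping rule in __post_init__ are assumptions carried over from the earlier examples, and the line numbers will not match.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    isbn: str
    title: str
    author: str
    page: int = 0
    notes: list = field(default_factory=list)  # a fresh list per object

    def __post_init__(self):
        # Runs right after the generated __init__().
        self.author = self.author.strip()


a = Book(isbn="1510752242", title="Restoring Faith in ... Science",
         author="  K Heckenlively ")
b = Book(isbn="1510752242", title="Restoring Faith in ... Science",
         author="K Heckenlively")
print(a)       # the generated __repr__ lists every field
print(a == b)  # True, the generated __eq__ compares field by field
```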
Dataclasses sit somewhere between the namedtuple, which is purely a record, and a full-blown class. They allow the programmer to quickly create a simple class and add some methods to it. Dataclasses provide rich decorators for use on either the class object or on class attributes.
Next Steps
Well, that was a whirlwind! By now you should know a lot about classes, and how to create them. In a future story, we’ll cover inheritance, polymorphism and encapsulation in more detail :)