Learn Programming with Python — Introduction to Compound Data Types: Sets and Tuples
Posted on May 03, 2020 in Learn Python
Let’s take a close look at compound data types in Python. Sets and tuples allow us to build and use richer, more expressive programs.
The Set Data Type in Python
In a previous instalment, I introduced you to the set compound data type. A set is an unordered collection of unique objects, usually ones which share something in common. Here is a set, which we can iterate over by passing it to a for loop statement:
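As a minimal sketch of iterating over a set (the name berries and its contents are my own assumptions, chosen only for illustration):

```python
# Hypothetical example values, chosen only for illustration.
berries: set = {"strawberries", "raspberries", "blueberries"}

visited = []
for berry in berries:
    # Sets are unordered, so the iteration order is not guaranteed.
    visited.append(berry)
    print(berry)
```

Every element is visited exactly once, but the order in which the berries are printed may differ from the order in which they were written.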
The set data type has some amazing capabilities! These are derived directly from the set theory branch of mathematics. You’ll be most familiar with this from what you already know about Venn diagrams.
Let’s create a bunch of sets and see how they work in Python.
In my code editor, I see the following:
What’s going on here?
- On lines 1, 2, and 3 I define three sets (each using a :set type hint) containing related elements. (Don’t shoot me if you’re a botanist, I’m doing my best!)
- On line 6 I’m printing out the intersection of treefruit and citrusfruit. This returns all elements appearing in both sets.
- On lines 12 and 13 I’m also printing intersections.
- On line 15 I’m creating the union: all the elements of two sets combined into a new set.
- On line 18 I’m finding the difference: which of the stonefruit and treefruit are not citrusfruit.
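The three operations described above can be sketched roughly as follows. The set names follow the text, but their contents are assumptions of mine, and the line numbers in the list above refer to the original listing rather than to this sketch.

```python
# Hypothetical sets for illustration; the contents are assumptions.
treefruit: set = {"apples", "pears", "oranges", "lemons", "peaches", "plums"}
citrusfruit: set = {"oranges", "lemons", "limes"}
stonefruit: set = {"peaches", "plums", "cherries"}

# Intersection: elements that appear in both sets.
tree_and_citrus = treefruit.intersection(citrusfruit)
print(sorted(tree_and_citrus))

# Union: every element of both sets, combined into a new set.
stone_or_tree = stonefruit.union(treefruit)
print(sorted(stone_or_tree))

# Difference: stone and tree fruit that are not citrus fruit.
not_citrus = stone_or_tree.difference(citrusfruit)
print(sorted(not_citrus))
```

The operators &, | and - are shorthand for the same three methods when both operands are sets.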
We can modify the set in-place using the methods pop(), remove() and discard(). pop() removes and returns an arbitrary element of the set (and raises a KeyError if the set is empty), remove() removes the given element from the set but fails with a KeyError if it wasn’t present, and discard() removes the given element if it is present and silently does nothing if it isn’t. See the following screenshot of me producing a runtime error on line 9 by using remove() on the set with a fruit which was already popped off on line 4.
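A minimal sketch of the same situation, assuming a small hypothetical set of fruit (the helper variable remove_failed is mine, used only to record what happened):

```python
citrusfruit: set = {"oranges", "lemons", "limes"}

popped = citrusfruit.pop()  # removes and returns an arbitrary element
print(f"Popped: {popped}")

citrusfruit.discard(popped)  # already gone; discard() does nothing

try:
    citrusfruit.remove(popped)  # already gone; remove() raises KeyError
    remove_failed = False
except KeyError:
    remove_failed = True
    print(f"{popped} was not in the set")
```

Which fruit gets popped is not predictable, so the sketch never assumes a particular one.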
Because the Python set is based on set theory, programmers often use sets when they wish to test for membership of an element in multiple collections, or simply as an easy way to remove duplicates from a collection.
Python also offers us the frozenset compound data type. A frozenset is immutable: once it has been created, elements can’t be added or removed, so it has no discard() method, for example.
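A short sketch of what that means in practice, again with hypothetical contents (the names frozen and discard_failed are my own):

```python
frozen = frozenset({"oranges", "lemons", "limes"})

# Read-only operations work exactly as they do on a set.
print("lemons" in frozen)
print(sorted(frozen.union({"satsumas"})))  # union() returns a new frozenset

# Mutating methods such as discard() do not exist on a frozenset.
try:
    frozen.discard("lemons")
    discard_failed = False
except AttributeError:
    discard_failed = True
    print("frozenset has no discard() method")
```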
The set() Constructor
So far, we have only been using curly brackets to create a new set:
citrusfruit:set = {"oranges", "lemons", "limes", "satsumas", "nectarines"}
However, the built-in function set() can create a new set based on the argument given. This is called a constructor because it is used to construct, and to return, a new object. For example, we can create a new set based on a string (remember: a string is a sequence of characters) using set():
characters:set = set("The quick brown fox jumped over the lazy dog.")
print(len(characters))
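How many unique characters are in the sentence? There are 28: the 25 different lowercase letters (the sentence says “jumped”, so there is no “s”), an uppercase “T” (which counts separately from the lowercase “t”), a space, and a period.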
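If you want to check which characters end up in the set, here is a small sketch using the standard string module. The variable names sentence and missing are my own.

```python
import string

sentence = "The quick brown fox jumped over the lazy dog."
characters: set = set(sentence)
print(len(characters))

# Which lowercase letters never appear, even when we ignore case?
missing = set(string.ascii_lowercase) - set(sentence.lower())
print(missing)
```

Sets are case-sensitive, so "T" and "t" are stored as two separate elements.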
The Tuple Data Type in Python
The tuple compound data type contains a sequence of comma-separated elements which, taken together, are considered to form a record. Programmers often use tuples when a single value is not enough to identify something. Like the address of a house! In computer science, we might formally say something like: the identity of any house object is composed of its attributes (street name, house number, city, postal code, and country). These 5 attributes of the house’s identity can easily be used in a tuple with 5 elements: an address.
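As a rough sketch, such an address might look like this. Every value below is made up for illustration.

```python
# Hypothetical address: street name, house number, city, postal code, country.
address: tuple = ("Example Street", 42, "Exampletown", "12345", "Exampleland")

print(len(address))

# Tuples of hashable values are hashable themselves,
# so an address can be stored in a set to identify a house.
visited_addresses = {address}
print(address in visited_addresses)
```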
In a previous instalment, we considered using a tuple with two elements to represent the geolocation of a point on the Earth’s surface, using its latitude and its longitude.
geolocation = (48.858093, 2.294694) #The Eiffel Tower
When we created a set, we enclosed its values in curly brackets {}. When we create a tuple, we use round brackets () to enclose the values.
After we’ve created a tuple, we can access its elements using their index. The index always begins at 0, and is always indicated by the use of square brackets acting on a variable such as geolocation[0].
>>> geolocation = (48.858093, 2.294694) #The Eiffel Tower
>>> print(f"Latitude: {geolocation[0]} Longitude: {geolocation[1]}")
Latitude: 48.858093 Longitude: 2.294694
An interesting thing about the tuple is that, once it has been created, it can’t be modified. The property which tells us whether an object can be modified is called mutability, and every tuple is immutable. Attempting to change the value at index 0 of our tuple will result in a TypeError:
>>> geolocation[0] = 50.0000
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
Python has a secret, hidden in the collections module! Remembering which index in a tuple contains which value can be a real PITA. Was latitude at index 0, or was that longitude? Let’s use a named tuple to resolve the issue! To use the namedtuple data type, we first need to import it from the collections module.
Here is what this looks like in my code editor:
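A minimal sketch along those lines, where the type name Geolocation and the field names are my own assumptions:

```python
from collections import namedtuple

# Hypothetical type name and field names, chosen for illustration.
Geolocation = namedtuple("Geolocation", ["latitude", "longitude"])

eiffel_tower = Geolocation(latitude=48.858093, longitude=2.294694)

print(eiffel_tower.latitude)  # access by name
print(eiffel_tower[1])        # access by index still works
```

A namedtuple is still a tuple, so it stays immutable and can be indexed as before.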
It is really simple to convert a tuple to a set (losing any duplicates) or a set into a tuple:
After executing this in my code editor, I get:
What’s going on here?
- On line 1 we define a variable named geolocation, and hint that we intend this to be a tuple (geolocation:tuple). We use the normal brackets to create a comma-separated list of the tuple’s elements. Note that two elements have the same value.
- On line 2 we define a variable named students, and hint that we intend this to be a set. We use curly brackets to create a comma-separated list of the set’s elements.
- On line 4 we use the set() function with the geolocation tuple as its argument. We print the new set, which has only a single unique value. Note that the output uses curly brackets to indicate that this is a set.
- On line 5 we use the tuple() function with the students set as its argument. We print the new tuple, which contains all of the elements of the set. Note that the output uses normal brackets to indicate that this is a tuple.
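The two conversions can be sketched like this. The values are hypothetical, and the line numbers in the list above refer to the original listing rather than to this sketch.

```python
# Hypothetical values; note that the tuple contains a duplicate.
readings: tuple = (48.858093, 48.858093)
students: set = {"Ada", "Grace", "Alan"}

unique_readings = set(readings)  # duplicates collapse into one element
print(unique_readings)

student_tuple = tuple(students)  # the order of elements is not guaranteed
print(student_tuple)
```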
A note on type hints. In other programming languages, it is not normal to optionally hint at the intended data type of the variable we’re creating. Declaring the type is either a hard requirement (e.g. Java), or it is seen as pointless (e.g. JavaScript). Even though on line 1 we hinted that the geolocation variable is intended to be a tuple, Python does not prevent us from assigning a set to it on line 4. Type hints were introduced in Python 3.5 (the variable annotation syntax used here followed in Python 3.6), and the documentation gives us the very helpful text:
Note
The Python runtime does not enforce function and variable type annotations. They can be used by third party tools such as type checkers, IDEs, linters, etc.
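A minimal sketch of what this means in practice, reusing the geolocation example:

```python
geolocation: tuple = (48.858093, 2.294694)

# The hint says tuple, but Python happily rebinds the name to a set.
geolocation = set(geolocation)
print(type(geolocation))
```

The code runs without any error. A third-party static type checker such as mypy would be expected to complain about the second assignment, but that check happens outside the Python runtime.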
Tuples can Contain Other Tuples!
You may be familiar with how to specify a colour’s name and its RGB value. Here is a handy table of common HTML colour names and their RGB values using hexadecimal notation. Let’s make this Pythonic, using tuples!
In my code editor, this is the output I get:
What’s going on here?
- First and foremost, notice my typo on line 21. I wrote “Nane” and not “Name”. Everyone makes mistakes! Feel free to fix it for me ;)
- On line 1 we’re importing the collections module again. I’ve grown to like it!
- On lines 3 and 4 we’re defining two new namedtuples. We’re also specifying that index [1] is named “rgbvalue”. At this index we will be storing an entire tuple, within the tuple.
- On line 6 we’re defining a set which will hold all the tuples we’ll define next.
- On lines 8, 11, 14, and 17 we create a variable called htmltuple. We assign a new namedtuple to this variable. We create the tuple value by calling the constructor htmlcolour and passing it the values we want. The first argument contains the name of the colour. The second argument contains a new namedtuple, this time constructed from the rgbvalue namedtuple defined on line 3.
- Notice that I’m using red=, blue=, and green= instead of the indices. Also notice that I’ve changed the order around a little to prove that using named indices works fine.
- You’ve noticed that the RGB values use the hexadecimal system, where hex FF is equal to decimal 255. In Python, we can directly use the hexadecimal values by prefixing them with the symbol 0x.
- On lines 9, 12, 15, and 18 we add the new tuple to the set called colours.
- On lines 21 to 25 I’m printing out (for each element in the set of colours) some user-friendly information. This line was too long for my editor, so I broke it up into smaller lines using multiple F-Strings.
- Notice that I’m using dot notation colour.rgbvalue.blue to keep digging deeper into the tuples. I think this is more programmer-friendly than writing colour[1][2] to reach the value of blue!
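The program described above can be sketched roughly as follows. The names rgbvalue, htmlcolour, htmltuple and colours come from the text, while the field order, the two sample colours and the exact print format are assumptions. The line numbers in the list refer to the original listing, not to this sketch.

```python
import collections

# Field names follow the description above; the exact original code is assumed.
rgbvalue = collections.namedtuple("rgbvalue", ["red", "green", "blue"])
htmlcolour = collections.namedtuple("htmlcolour", ["name", "rgbvalue"])

colours: set = set()

# Keyword arguments may be given in any order.
htmltuple = htmlcolour("Red", rgbvalue(red=0xFF, blue=0x00, green=0x00))
colours.add(htmltuple)

htmltuple = htmlcolour("Teal", rgbvalue(blue=0x80, green=0x80, red=0x00))
colours.add(htmltuple)

for colour in colours:
    # One long line, broken up into several adjacent f-strings.
    print(f"Name: {colour.name} "
          f"Red: {colour.rgbvalue.red} "
          f"Green: {colour.rgbvalue.green} "
          f"Blue: {colour.rgbvalue.blue}")
```

Because namedtuples are hashable, they can be added to a set just like plain tuples.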
What have we Achieved?
Really, such a lot! If you’ve got this far you should be really proud of your achievement! I hope you’re having fun.
- We’ve discussed sets and tuples, again.
- We’ve considered the constructor, used to create new objects.
- We’ve considered mutability, the property of an object which tells us if it is modifiable or not.
- We’ve explored type hints, noting that they are nothing more than a helpful hint.
- We’ve looked at the frozenset, an immutable variant of the set data type.
- We explored using indexes to get to a specific element in a tuple.
- We’ve used the namedtuple data type to write tuples in a more readable way.
- We’ve learned about casting, or converting, between sets and tuples.
I hope you enjoyed this! If you spotted any errors please do let me know! In the next instalment in this series we’ll explore two extremely useful compound data types, list and dict.
Articles in this series so far:
- Learn Programming with Python — An Introduction
- Learn Programming with Python — Introduction to Functions
- Learn Programming with Python — Controlling Execution Flow
- Learn Programming with Python — Introduction to Data Types: Strings
- Learn Programming with Python — Introduction to Data Types: Numbers
- Learn Programming with Python — Introduction to Compound Data Types: Sets and Tuples
- Learn Programming with Python — Introduction to Compound Data Types: Lists
- Learn Programming with Python — Introduction to Compound Data Types: Dictionaries