Tuples

A Python tuple is an ordered, immutable sequence of data items. By ordered, we mean that the items in a tuple are stored in a definite order, not necessarily that they are sorted. By immutable, we mean that the elements of a tuple cannot be reassigned or deleted, and the tuple itself cannot be extended or contracted.
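As a quick illustration of immutability, here is a minimal sketch (the variable names are ours, chosen only for the example) that tries to alter and delete an element and records the errors Python raises:

```python
# A tuple rejects item assignment and item deletion, and has no append method.
point = (3, 4, 5)

errors = []
try:
    point[0] = 10          # attempt to alter an element
except TypeError as exc:
    errors.append(str(exc))

try:
    del point[1]           # attempt to delete an element
except TypeError as exc:
    errors.append(str(exc))

has_append = hasattr(point, "append")   # False: tuples cannot be extended in place

print(errors)       # two TypeError messages
print(has_append)
print(point)        # the tuple is unchanged
```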

It might seem that a tuple is just a restricted version of a list, but the Python documentation suggests that tuples and lists are intended for different applications. Although Python is dynamically typed, so that data of different types can be mixed together within both lists and tuples, a list is primarily intended for collections of data of the same type, much like an array in other languages. A tuple, on the other hand, is primarily intended to store a collection of related items, such as the name, date of birth and nationality of a person. Thus a tuple might be used to store data on one particular person, while data for all persons in a group might be stored as a list of tuples. Viewed in this way, a Python tuple is similar to a struct in C# or C++ (although data in structs can be altered, so the comparison isn’t entirely accurate).
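To make the "list of tuples" idea concrete, here is a small sketch. The people and their details are invented purely for illustration:

```python
# Each tuple holds (name, date of birth, nationality) for one person.
# The records below are made-up sample data.
people = [
    ("Alice Smith", "1990-04-12", "British"),
    ("Bruno Costa", "1985-11-30", "Brazilian"),
    ("Chloe Martin", "1992-07-08", "British"),
]

# The list groups records of the same kind; each tuple groups related fields.
names = [person[0] for person in people]
british = [name for name, _dob, nationality in people if nationality == "British"]

print(names)
print(british)
```

The list can grow or shrink as people are added or removed, while each person's record keeps a fixed shape.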

A tuple can be defined by listing the data fields separated by commas and (optionally) enclosed in parentheses. Open an interactive Python session (for example, from PowerShell) and try the following:

tup = 1, 2, 3
tup

You’ll get (1, 2, 3) printed out. When displaying a tuple, the elements are always enclosed in parentheses, but the parentheses are optional when defining a tuple. Thus, we could have typed tup = (1, 2, 3) instead of the first line above.

To create a tuple with a single element you can type either tup = 1, (note the trailing comma) or tup = (1,). The trailing comma is needed to tell Python that you want a tuple and not just a plain int.
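The difference the trailing comma makes is easy to check. The following sketch (variable names are ours) compares the types that result:

```python
not_a_tuple = (1)    # parentheses only group the expression; this is the int 1
one_tuple = (1,)     # the trailing comma makes a one-element tuple
also_tuple = 1,      # the parentheses are optional here too
empty = ()           # an empty tuple needs the parentheses but no comma

print(type(not_a_tuple).__name__)   # int
print(type(one_tuple).__name__)     # tuple
print(one_tuple == also_tuple)      # True
print(len(empty))                   # 0
```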

Accessing tuple elements

Although a tuple is immutable, we can access (in a read-only sense) its elements using the same subscripting and slicing syntax that we introduced in the discussion of lists. Try the following in the interactive session:

tup = tuple(x for x in range(12))
tup
tup[6]
tup[0:8]
tup[8:0]
tup[1:5:2]
tup[::-1]

The first line uses the tuple constructor to create a tuple from a generator expression. This is similar to the list comprehensions that we looked at earlier, but in the case of tuples, we need to explicitly use the constructor name rather than just enclosing the generator expression in parentheses. That is, tup = (x for x in range(12)) won’t give you a tuple (try it!). It’s not an invalid statement though; rather it returns a generator object, which isn’t of much use here.
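The following sketch shows the difference between the bare generator expression and the tuple built from it. It also shows that a generator can only be consumed once, which is one reason it is not a substitute for a tuple:

```python
gen = (x for x in range(12))     # parentheses alone give a generator object
print(type(gen).__name__)        # generator

tup = tuple(gen)                 # the constructor consumes the generator
print(tup)                       # (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)

leftover = tuple(gen)            # the generator is now exhausted
print(leftover)                  # ()

direct = tuple(range(12))        # any iterable works, so range() can be passed directly
print(direct == tup)             # True
```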

Returning to the above example, line 2 produces the tuple (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11). Line 3 produces the seventh element tup[6], which is 6 here. Line 4 returns a tuple consisting of elements 0 through 7 (remember that the ‘end’ index is the index after the last one included), so we get (0, 1, 2, 3, 4, 5, 6, 7). In the expression tup[8:0], the start index lies beyond the end index while the step is the default +1, so no elements are selected and we get the empty tuple ().

Next, tup[1:5:2] starts at tup[1] and proceeds up to tup[4] in steps of 2, so we get (1, 3). Finally, tup[::-1] starts at the end and proceeds backwards through the tuple in steps of 1, so it returns the tuple in reverse order: (11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0).

It’s important to note that, in all the cases where we used a slice (that is, one or two colons), the expression returns a new tuple containing the desired elements. The original tuple is unchanged.
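As a small check of this point, the sketch below takes a few slices and confirms that the original tuple is untouched. The slice tup[8:0:-1] is our own addition: it shows that a range running from 8 down towards 0 does select elements once the step is negative.

```python
tup = tuple(range(12))
original = tup                 # keep a second reference to compare against later

head = tup[0:8]                # elements 0 through 7
nothing = tup[8:0]             # start beyond end with a positive step: empty
countdown = tup[8:0:-1]        # a negative step walks from index 8 down to index 1

print(head)                    # (0, 1, 2, 3, 4, 5, 6, 7)
print(nothing)                 # ()
print(countdown)               # (8, 7, 6, 5, 4, 3, 2, 1)
print(tup is original)         # True: slicing never altered tup
print(len(tup))                # 12
```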

Joining tuples

Tuples can be joined using the + operator. Consider:

tup = tuple(x for x in range(12))
xup = tuple(x for x in range(12, 20, 2))
tup
xup
yup = tup + xup
yup
yup += xup

This generates tup as the same tuple as before: (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11), and xup as the tuple starting at 12 and stopping before 20, in steps of 2, so we have xup = (12, 14, 16, 18).

The statement yup = tup + xup generates a new tuple consisting of xup joined onto the end of tup, so we have yup = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 18).

The last line might come as a surprise. It is a valid statement, and if you print out yup afterwards, you’ll get (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 12, 14, 16, 18), so it seems that yup has been modified by appending another copy of xup. However, this statement doesn’t actually violate the immutability of tuples. The statement yup += xup is effectively the same as yup = yup + xup. The right-hand side generates a new tuple which is the concatenation of yup and xup, and then this new tuple is assigned to the variable yup. In other words, yup now refers to a brand new tuple, and the old tuple is no longer reachable through yup. Immutability is preserved.
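One way to see that += rebinds the name rather than modifying the tuple is to keep a second reference to the original object. The sketch below (the names alias, nums and nums_alias are ours) also contrasts this with a list, where += does modify the object in place:

```python
xup = (12, 14, 16, 18)

yup = (0, 1, 2)
alias = yup              # a second name for the same tuple object
yup += xup               # builds a new tuple and rebinds yup to it

print(alias)             # (0, 1, 2): the original tuple is untouched
print(yup)               # (0, 1, 2, 12, 14, 16, 18)
print(alias is yup)      # False: yup now refers to a different object

# For comparison, += on a list extends the existing object in place.
nums = [0, 1, 2]
nums_alias = nums
nums += [12, 14]
print(nums_alias)        # [0, 1, 2, 12, 14]
print(nums_alias is nums)  # True
```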

Packing and unpacking

The statement tup = 1, 2, 3 that we started with is an example of packing data into a tuple. Tuples allow their contents to be unpacked into individual variables as well. Consider:

tup = 1, 2, 3
a, b, c = tup
a
b
c

Line 2 unpacks tup into three separate variables, so we have a = 1, b = 2 and c = 3 afterwards. In this basic form, unpacking requires that the number of variables on the left-hand side exactly matches the number of elements in the tuple.
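The sketch below shows what happens when the counts don't match, along with two common uses of packing and unpacking. The starred form on the last lines is an extension not covered above: it collects any surplus elements into a list.

```python
tup = 1, 2, 3
a, b, c = tup                    # counts match: a = 1, b = 2, c = 3

mismatch = None
try:
    p, q = tup                   # two names for three elements
except ValueError as exc:
    mismatch = type(exc).__name__
print(mismatch)                  # ValueError

a, b = b, a                      # pack (b, a) on the right, unpack on the left: a swap
print(a, b)                      # 2 1

first, *rest = tup               # starred name absorbs the remaining elements
print(first, rest)               # 1 [2, 3]
```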

Tuples and lists

You can convert a tuple into a list and vice versa by using the corresponding constructor. For example:

tup = 1, 2, 3
tupList = list(tup)
tupList[1] = 9
tup = tuple(tupList)
tup

We create the tuple (1, 2, 3) and convert it to the list tupList. Since lists are mutable, we can change its element at index 1 to 9, and then convert it back to a tuple, which now reads (1, 9, 3). Although this sort of thing is possible, it’s probably not the best program design, since it indicates that you’re not using the right data type (tuple vs list) in the first place.

Exercise

Write a program that asks the user to input a sequence of shopping items where each item has a name and a price. The data for each item should be stored in a tuple, and the program should maintain a list of these tuples. When the user enters ‘quit’, the program should then calculate and print the total cost of the items in the list.

The cost of each item should be in the usual format of pounds and pence, or dollars and cents, so the numeric value of the cost will require 2 digits after the decimal point. Use the Decimal data type to store the cost.

See answer
from decimal import Decimal
shopList = []  # list of (name, price) tuples
while True:
    item = input('Enter item\'s name and price separated by a comma, or \'quit\': ')
    if item == 'quit':
        break
    itemData = item.split(',')
    itemTup = itemData[0].strip(), Decimal(itemData[1])  # Decimal ignores surrounding spaces
    shopList += [itemTup]
cost = Decimal('0.0')
for tup in shopList:
    cost += tup[1]  # tup[1] is the price
print('Total cost =', cost)

After reading in the data on line 4, we check for ‘quit’. If we’re still adding an item, we split the input string on the comma on line 7, then store the two parts of the data in a tuple on line 8. Since a Decimal requires a string argument to store the value without roundoff error, we can pass itemData[1] directly to the Decimal constructor. We then add itemTup to shopList, but note that we need to enclose itemTup in square brackets [itemTup] so that the tuple gets added as a single element to the list shopList. If we omitted the brackets, the individual elements of itemTup would be added to shopList, and not the tuple.
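Two points in this explanation can be checked directly. The sketch below (with an invented item) compares adding a tuple to a list with and without the square brackets, and compares Decimal arithmetic built from strings with ordinary float arithmetic:

```python
from decimal import Decimal

item = ("milk", Decimal("1.20"))     # made-up sample item

with_brackets = []
with_brackets += [item]              # adds the tuple as one element
print(with_brackets)                 # [('milk', Decimal('1.20'))]

without_brackets = []
without_brackets += item             # adds each element of the tuple separately
print(without_brackets)              # ['milk', Decimal('1.20')]

appended = []
appended.append(item)                # append() is an alternative to += [item]
print(appended == with_brackets)     # True

# Decimal built from strings keeps exact decimal values; floats do not.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))   # True
print(0.1 + 0.2 == 0.3)                                       # False
```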

After the user enters ‘quit’, control passes to the for loop on line 11, where we add up the costs of the items.

Named tuples

Since one of the main uses of tuples is the creation of structured data types for storing data fields of mixed, but fixed, types, having to refer to the individual data fields by their index number can lead to obscure program code that is hard to read. There is a special type of tuple called the namedtuple available in the collections module. We won’t go into great detail here, but will give a simple example of how you can define and use a named tuple.

from collections import namedtuple
Employee = namedtuple('Employee', ['name', 'age', 'nationality'])
empList = []
while True:
    empData = input('Name, age, nationality separated by commas (or quit): ')
    if empData == 'quit':
        break
    empData = [field.strip() for field in empData.split(',')]  # split on commas, trim spaces
    emp = Employee(empData[0], int(empData[1]), empData[2])
    empList += [emp]
print("Employees' names:")
for x in empList:
    print(x.name)
    

Line 1 imports namedtuple from the collections module. On line 2, we define a named tuple type called Employee (the object to the left of the =). The namedtuple function takes 2 arguments here. The first is a string which is the name of the data type (which we’re calling Employee). The second argument is a list of strings which define the names of the data fields in the tuple. Here, we’re storing the name, age and nationality of each employee. Note that, due to dynamic typing, we do not define the data types of these data fields.

The while loop on line 4 allows the user to enter the name, age and nationality of a number of employees. We’re not doing any error checking on the input, but in a real application you would, of course, need to do this.

On line 9, we create an Employee namedtuple called emp, and store the name, age and nationality in this tuple. Line 10 adds the tuple to the master list empList. Note that we want each entry in empList to be a tuple, so we need to enclose emp in brackets, as [emp]. Otherwise, each element within emp would be added to empList as a separate item, and the namedtuple would be lost.

Line 12 loops over the entries in empList to print out only the employees’ names. Note that on line 13 we can refer to the data field by using its label ‘name’ rather than having to use the index notation [0]. This makes the code clearer and easier to read.
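A named tuple is still a tuple, so everything covered earlier continues to work. The sketch below uses an invented employee record to show access by name, by index and by unpacking, along with two helper methods that namedtuple provides, _replace and _asdict:

```python
from collections import namedtuple

Employee = namedtuple('Employee', ['name', 'age', 'nationality'])
emp = Employee('Ana', 34, 'Portuguese')      # made-up sample record

print(emp.name)                  # Ana: access by field name
print(emp[0])                    # Ana: index access still works
name, age, nationality = emp     # so does unpacking
print(isinstance(emp, tuple))    # True

immutable = False
try:
    emp.age = 35                 # fields cannot be reassigned
except AttributeError:
    immutable = True
print(immutable)                 # True

older = emp._replace(age=35)     # returns a new Employee; emp is unchanged
print(older.age, emp.age)        # 35 34
print(emp._asdict()['nationality'])   # Portuguese
```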

Exercise

Rewrite the shopping list program from the first exercise so that it uses a namedtuple to store the name and cost for each item.

See answer
from decimal import Decimal
from collections import namedtuple
Item = namedtuple('Item', ['name', 'cost'])
shopList = []  # list of Item named tuples
while True:
    item = input('Enter item\'s name and price separated by a comma, or \'quit\': ')
    if item == 'quit':
        break
    itemData = item.split(',')
    itemTup = Item(itemData[0].strip(), Decimal(itemData[1]))
    shopList += [itemTup]
cost = Decimal('0.0')
for tup in shopList:
    cost += tup.cost  # access the price by field name
print('Total cost =', cost)


The changes from the first version are:

  • importing namedtuple from the collections module on line 2
  • defining the Item namedtuple on line 3
  • creating an Item tuple on line 10 to store the data for a single item
  • using the name ‘cost’ of the data field rather than its index number on line 14

 
