Lists: the basics

The Python list is a powerful data type with a lot of functionality, so we’ll cover it in several posts. Here we start with some of its basic properties.

First, it’s important to understand that Python supports four different built-in compound data types: lists, tuples, sets and dictionaries. Each of these has its own set of rules and peculiarities.

A list is an ordered, mutable collection of data elements. It allows duplicate elements; that is, the same value can appear two or more times in a list.

By ordered, we mean that the items in a list are stored in a definite order, so that if we access, say, the fourth element of the list, it will always be the same datum (unless we’ve modified the list in the meantime). Note that ‘ordered’ does not mean ‘sorted’: numeric values in a list need not be in ascending order. Sorting the items in a list is possible, but it’s a separate operation, as we’ll see.

By mutable, we mean that the contents of a list can be changed, either by adding or removing elements, or by changing the data stored at an existing list position.

As Python is a dynamically typed language, the elements in a list need not all be of the same data type, so we can mix ints, floats, strings and even other lists as elements within a given list.

At this stage, it’s useful to play around with lists in PowerShell (or any other terminal), so start one up and type ‘python’ at the prompt to enter the Python interpreter.

A list is specified by typing its elements, separated by commas, between square brackets. For example:

x = [1,2,2,3,4,4,5]

If you now type x at the prompt, you’ll just get [1, 2, 2, 3, 4, 4, 5] as the response.

To access a particular element within a list, give the list’s name followed by the element’s position in square brackets. The elements of a list are indexed starting with 0 for the first item, so x[2] gives the third element. Typing x[2] after the above gives us 2.

We can use the same notation to change a list element. Try typing x[2] = 'wibble' followed by x. You’ll see the list is now [1, 2, 'wibble', 3, 4, 4, 5]. This illustrates that you can mix data types within a list.

Now try y = [x, 3.14, -7+4j], and then type y to see what y looks like. You’ll see that it is [[1, 2, 'wibble', 3, 4, 4, 5], 3.14, (-7+4j)]. The list x is now the first element of y, followed by the float 3.14 and the complex number -7+4j.
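As a small sketch of what nesting means in practice, the snippet below rebuilds x and y from the text and reaches inside the nested list with a second index. It also shows that y[0] is the very same list object as x rather than a copy, so a later change to x is visible through y.

```python
x = [1, 2, 2, 3, 4, 4, 5]
x[2] = 'wibble'
y = [x, 3.14, -7+4j]

# The first element of y is itself a list, so a second index reaches inside it.
print(y[0])       # [1, 2, 'wibble', 3, 4, 4, 5]
print(y[0][2])    # wibble
print(len(y))     # 3: the nested list counts as a single element

# y[0] refers to the same list object as x, not to a copy of it.
x[0] = 'changed'
print(y[0][0])    # changed
```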

If you want a single list consisting of the elements of x followed by 3.14 and -7+4j, you can use the + operator: z = x + [3.14, -7+4j]. If you print out z, you’ll see that it is [1, 2, 'wibble', 3, 4, 4, 5, 3.14, (-7+4j)]. Note that the inner pair of brackets around the first 7 elements is now absent, so all the primitive data elements are part of a single list. If you want to append a list to an existing list, you can use the += operator, as in x += [3.14, -7+4j].
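The difference between + and += can be seen by keeping a second name for the same list. In this sketch, alias is just an illustrative variable name: + builds a new list and leaves x alone, while += extends the existing list in place, so the change is visible through alias as well.

```python
x = [1, 2, 'wibble']
alias = x                  # a second name for the same list object

z = x + [3.14, -7+4j]      # builds a new list; x is left alone
print(x)                   # [1, 2, 'wibble']
print(z)                   # [1, 2, 'wibble', 3.14, (-7+4j)]

x += [3.14, -7+4j]         # extends the existing list in place
print(x)                   # [1, 2, 'wibble', 3.14, (-7+4j)]
print(alias)               # [1, 2, 'wibble', 3.14, (-7+4j)]
print(alias is x)          # True
```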

The number of elements in a list x can be obtained using len(x).

Exercises

1. Write a program that generates 10 random floats between 0 and 1 and stores them in a list. Given that, for a list x, the statement x.sort() sorts the list into ascending order, print out both the original list and the sorted list. You’ll need the random() function from the module random to generate the random numbers.

See answer
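from random import random   # import only the function we need
ranList = []
for i in range(10):
    ranList += [random()]   # random() returns a float in the range [0, 1)
print('Unsorted list:')
print(ranList)
ranList.sort()              # sorts the list in place
print('Sorted list:')
print(ranList)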

2. You’ll see from the above exercise that running the sort() function on a list replaces the list with its sorted version. If you want to retain the original list as well as have a sorted version of the list, you’ll need to copy the list before you sort it. Investigate the copy() function for a list, and modify the above program so that both the original list and its sorted version are present at the end (so you can print them both out after doing the sort).

See answer
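from random import random   # import only the function we need
ranList = []
for i in range(10):
    ranList += [random()]   # random() returns a float in the range [0, 1)
ranCopy = ranList.copy()    # keep the original order in a separate list
ranList.sort()              # sorts ranList in place; ranCopy is unaffected
print('Unsorted list:')
print(ranCopy)
print('Sorted list:')
print(ranList)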

Accessing and changing list elements

As mentioned above, a single list element can be accessed or changed by specifying its index, as in x[2]. We can also use a negative index to count backwards from the end of the list, so x[-2] is the second element from the end. When counting backwards, the last element of the list has index -1, the second-to-last has index -2, and so on. When counting forwards, the index starts at 0.
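One way to keep the two numbering schemes straight is to note that, for a list of length n, the index -k refers to the same element as the index n - k. A short check of that relationship, using the original list x from earlier:

```python
x = [1, 2, 2, 3, 4, 4, 5]
n = len(x)                 # 7

print(x[-1])               # 5, the last element
print(x[-2])               # 4, the second-to-last element

# For every valid k, x[-k] and x[n - k] are the same element.
matches = True
for k in range(1, n + 1):
    if x[-k] != x[n - k]:
        matches = False
print(matches)             # True
```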

We can also specify a range of indexes from within a list. The notation z[m:n] returns a new list containing elements m through n-1 from the original list. With the list z = [1, 2, 'wibble', 3, 4, 4, 5, 3.14, (-7+4j)], z[1:3] is the list [2, 'wibble'] (that is, the elements at indexes 1 and 2 of the original list). It’s often confusing that the index n in z[m:n] is not the index of the last element returned; rather it is the index of the element after the last element returned.

This sublist notation has a couple of shorthand versions. The list z[:3] returns elements 0 through 2, z[3:] returns elements 3 through to the end of the list, and z[:] returns a copy of the entire list (so this is an alternative to the copy() function for copying a list).

You can use negative indexes in sublist expressions, so that z[-3:-1] returns the sublist starting with the element third from the end and ending with the element just before the last element (remember that the last element has index -1, and the sublist notation returns elements up to, but not including, the element with the second index).
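The sublist forms described so far can be tried side by side. This sketch uses the list z from the text and also checks two handy consequences of the 'up to but not including' rule: when both indexes lie within the list, z[m:n] has n - m elements, and z[:k] + z[k:] always rebuilds the whole list.

```python
z = [1, 2, 'wibble', 3, 4, 4, 5, 3.14, (-7+4j)]

print(z[1:3])      # [2, 'wibble']
print(z[:3])       # [1, 2, 'wibble']
print(z[3:])       # [3, 4, 4, 5, 3.14, (-7+4j)]
print(z[-3:-1])    # [5, 3.14]

# Because the second index is excluded, the slice length is n - m here.
print(len(z[1:3]))           # 2

# Splitting at any position and rejoining gives back an equal list.
print(z[:3] + z[3:] == z)    # True
```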

If you give an impossible combination of indexes, such as z[4:2], you just get back an empty list (rather than an error message, so be careful!). Also, if one of the indexes in a sublist is out of range (that is, no such element exists) you’ll still get a valid list returned, rather than an error. Specifying a single list element that is out of range, however, will give you an error. It can all be somewhat confusing, so just take care!
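The contrast between slices and single indexes is easiest to see by trying both. In this sketch, the flag name gotError is only for illustration: out-of-range slices quietly return whatever part of the list exists, while an out-of-range single index raises an IndexError.

```python
z = [1, 2, 'wibble', 3, 4, 4, 5, 3.14, (-7+4j)]

print(z[4:2])       # []  (start is after stop)
print(z[7:100])     # [3.14, (-7+4j)]  (stop is cut off at the end of the list)
print(z[50:60])     # []  (the whole range is past the end)

gotError = False
try:
    z[50]           # a single out-of-range index is an error
except IndexError as err:
    gotError = True
    print('IndexError:', err)
```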

A slice of a list can be modified using the colon notation. Consider the following:

xList = []
for x in range(10):
    xList += [x * x]
print(xList)

xList[3:5] = ['a', 'b', 'c', 'd']
print(xList)

xList[5:6] += [1.2, 2.4, 4.5]
print(xList)

The first print statement gives us a list of squares from 0 to 9: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]. The expression xList[3:5] = ['a', 'b', 'c', 'd'] replaces list elements 3 and 4 with ['a', 'b', 'c', 'd'], so xList is now [0, 1, 4, 'a', 'b', 'c', 'd', 25, 36, 49, 64, 81]. Note that when replacing a slice of a list with another list, the numbers of elements in the two lists need not be the same. In this example, we replaced 2 elements with 4 elements, so the new list expanded by 2 elements.

We can use the += operator to add some elements at a location in the list, as the last example above shows. In this case the slice xList[5:6] (currently the single element 'c') has the elements [1.2, 2.4, 4.5] appended to it, so the new list is now [0, 1, 4, 'a', 'b', 'c', 1.2, 2.4, 4.5, 'd', 25, 36, 49, 64, 81].
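Since the replacement list need not match the slice in length, two special cases follow from the same rule. In the sketch below (using a shorter list of squares for brevity), assigning an empty list to a slice removes those elements, and assigning to an empty slice such as xList[1:1] inserts elements without replacing anything.

```python
xList = [0, 1, 4, 9, 16]

xList[1:3] = []             # replace two elements with none: removes them
print(xList)                # [0, 9, 16]

xList[1:1] = ['a', 'b']     # the slice [1:1] is empty, so nothing is replaced
print(xList)                # [0, 'a', 'b', 9, 16]
```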

Exercise

In the last example above, since we are adding elements to a particular location in the list, you might think we could replace xList[5:6] += [1.2, 2.4, 4.5] with xList[5] += [1.2, 2.4, 4.5]. Try it, and explain why it doesn’t work.

See answer

While xList[5:6] refers to a slice of a list, the expression xList[5] refers to a single element within the list, so its data type is whatever the data type of that element is. In this case, xList[5] is 'c' (a string), so the expression xList[5] += [1.2, 2.4, 4.5] attempts to concatenate a string with a list of floats, which isn’t a valid operation.
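The failure can be observed safely by wrapping the statement in try/except, as in this sketch. The second half uses an illustrative list called nested to show the flip side: if the single element happens to be a list itself, += on that element does work, but it extends the inner list rather than the outer one.

```python
xList = [0, 1, 4, 'a', 'b', 'c', 'd', 25]

failed = False
try:
    xList[5] += [1.2, 2.4, 4.5]     # str += list is not allowed
except TypeError as err:
    failed = True
    print('TypeError:', err)

print(xList[5])                     # c (the failed statement changed nothing)

# If the element is itself a list, += extends that inner list.
nested = [0, ['c'], 25]
nested[1] += [1.2, 2.4]
print(nested)                       # [0, ['c', 1.2, 2.4], 25]
```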

The double colon operator

The single-colon version of list slicing discussed above has a more general form, in which we can specify not only the start and end points of the slice, but also the step size to be used when selecting elements in the slice. Consider the following:

xList = []
for x in range(10):
    xList += [x * x]
print(xList)

print(xList[2:6:2])

print(xList[-2:3:-1])

print(xList[-2:-3:-1])

print(xList[::-1])

We start with a list of squares from 0 to 9 as before, so xList starts as [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]. The expression xList[2:6:2] starts at location 2 and ends at location 6-1=5, but proceeds in steps of 2, so elements 2 and 4 are selected. The resulting list is therefore [4, 16].

The second example xList[-2:3:-1] starts at location -2, which is the second element from the end, and proceeds to the element just before 3, in steps of -1. The slice therefore proceeds backwards, starting at element xList[8]. The ‘element just before 3’ in this context is the last element we encounter when counting down to 3, so it is element 4, which is xList[4], or 16. Thus xList[-2:3:-1] gives us the list [64, 49, 36, 25, 16].

The third example xList[-2:-3:-1] again starts at the second-to-last element (element 8) and counts down in steps of 1, but ends at the element just before the third-to-last element (element 7). Thus this list contains only one element, which is xList[8], so we have [64].

The last example xList[::-1] contains only a step size of -1, meaning we step backwards through the list. When either the start or stop index is missing, the default value is to take the location that maximizes the portion of the list that is traversed. In this case, since neither start nor stop is specified, the entire list is processed, and, since we step backwards, this gives us the list in reverse order, so we have [81, 64, 49, 36, 25, 16, 9, 4, 1, 0].
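If you want to see the start, stop and step that Python actually uses for a given slice, one option is the indices() method of the built-in slice object, which resolves negative and missing values for a list of a given length. The sketch below applies it to the four examples above; the variable names n and positions are only for illustration.

```python
xList = []
for x in range(10):
    xList += [x * x]
n = len(xList)

# slice(a, b, c).indices(n) gives the resolved (start, stop, step) for length n.
print(slice(2, 6, 2).indices(n))           # (2, 6, 2)
print(slice(-2, 3, -1).indices(n))         # (8, 3, -1)
print(slice(-2, -3, -1).indices(n))        # (8, 7, -1)
# Here the stop of -1 means 'just before index 0', not 'the last element'.
print(slice(None, None, -1).indices(n))    # (9, -1, -1)

# Feeding the resolved values to range() lists the positions a slice visits.
start, stop, step = slice(-2, 3, -1).indices(n)
positions = []
for i in range(start, stop, step):
    positions += [i]
print(positions)                           # [8, 7, 6, 5, 4]
print(xList[-2:3:-1])                      # [64, 49, 36, 25, 16]
```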

Exercise

Predict (without running the code) what each print statement will print in the following code:

xList = []
for x in range(10):
    xList += [x * x]
print(xList)

print(xList[5::-1])

print(xList[:5:-1])

print(xList[2:2:1])

print(xList[-2:2:1])

print(xList[-8:-3:2])

print(xList[::1])
See answer

The first print just gives us our list of squares: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81].

xList[5::-1] starts at xList[5] and steps backwards in steps of 1. Since no stop index is given, the slice goes to the beginning of the list, so we have [25, 16, 9, 4, 1, 0].

xList[:5:-1] has an end index of 5, and a backwards step of size 1. Since no start index is given and we are stepping backwards through the list, we start at the end of the list and proceed down to the last element before element 5, which is xList[6]. The list is therefore [81, 64, 49, 36].

xList[2:2:1] starts at xList[2] and proceeds forward in steps of 1. However, the end index is 2, which means that the slice ends at the element just before xList[2], that is, xList[1]. Such a slice is impossible, so we get an empty list [].

xList[-2:2:1] starts at the second last element which is xList[8] and steps forwards from there. The end point is given as index 2, which again cannot be reached by stepping forwards from element 8, so we again have an impossible slice and get an empty list [].

xList[-8:-3:2] starts at the eighth-to-last element xList[2] and proceeds forwards in steps of 2 up to, but not including, the third-to-last element xList[7]. The elements selected are xList[2], xList[4] and xList[6], so we get [4, 16, 36].

xList[::1] steps forward in steps of 1. Since neither endpoint is specified, the entire list is processed, so we get the original list [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]. Thus xList[::1] is equivalent to a copy of xList.

One final note about list slices: a list slice produces a new list containing the elements selected in the slice. In particular, the last example above xList[::1] produces a copy of the original xList. Consider:

xCopy = xList[::1]
xCopy[3] = 'wibble'
print(xList, xCopy)

We would find that xList has not changed, but xCopy is a modified version of xList, so the two lists are now xList == [0, 1, 4, 9, 16, 25, 36, 49, 64, 81] and xCopy == [0, 1, 4, 'wibble', 16, 25, 36, 49, 64, 81].
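It can help to contrast a slice copy with plain assignment, which does not copy anything. In this sketch, sameList is an illustrative name: the slice copy is equal to the original but is a separate object, whereas sameList is just another name for xList, so changes made through it show up in xList.

```python
xList = []
for x in range(10):
    xList += [x * x]

xCopy = xList[::1]        # a new list with the same elements
print(xCopy == xList)     # True: same contents
print(xCopy is xList)     # False: different list objects

sameList = xList          # plain assignment: no copy is made
sameList[3] = 'wibble'
print(xList[3])           # wibble
print(xCopy[3])           # 9
```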

Lists and for loops

A list can serve as a source of elements processed by a for loop. For example:

z = [1, 2, 'wibble', 3, 4, 4, 5, 3.14, (-7+4j)]
for elem in z:
    print(type(elem))

This program prints the data type of each element in the list, so we get

<class 'int'>
<class 'int'>
<class 'str'>
<class 'int'>
<class 'int'>
<class 'int'>
<class 'int'>
<class 'float'>
<class 'complex'>
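One point worth checking for yourself is that the loop variable is only a name for each element in turn, so assigning to it does not change the list. A minimal sketch (the list values is made up for illustration): to change the elements, loop over the indexes instead and assign through the list.

```python
values = [1, 2, 3]

for elem in values:
    elem = elem * 10              # rebinds the name elem only
print(values)                     # [1, 2, 3]

for i in range(len(values)):
    values[i] = values[i] * 10    # assigns into the list itself
print(values)                     # [10, 20, 30]
```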

Exercise

Write a program that asks the user for an integer numRands, then generates a list containing numRands random numbers between 0 and 1. You should sort this list (no need to retain a copy of the original list). You should then save the portion of the list with numbers > 0.5 in a new list, print out the length of this list, and then calculate the average value of all the numbers in that list. If the random number generator is working properly, you’d expect the length of the sublist to be about half of numRands, and the average value of numbers > 0.5 to be around 0.75.

There’s no need to check that the user’s input is valid (although you can if you want to). You should include tests to handle the cases where all the random numbers are < 0.5 or all are > 0.5.

See answer
from random import random
while True:
    numRands = input('How many random numbers (or \'quit\')? ')
    if numRands == 'quit':
        break

    numRands = int(numRands)        # Number of random numbers to test
    ranList = []
    for i in range(numRands):       # Generate the list of randoms
        ranList += [random()]
    ranList.sort()                  # Sort it
    count = 0
    ranLength = len(ranList)
    while count < ranLength and ranList[count] < 0.5:     # Find the first element > 0.5
        count += 1
    upperHalf = ranList[count:]     # Create the sublist of elements > 0.5
    print('Number > 0.5: ', len(upperHalf))
    if len(upperHalf) > 0:
        average = 0.0
        for num in upperHalf:           # Calculate the average of the sublist
            average += num
        average /= len(upperHalf)
        print('Average of upper half =', average)
    else:
        print('No elements > 0.5')


Hopefully, the comments explain what the code is doing. The condition count < ranLength in the while loop stops the search from running off the end of the list when all the random numbers are < 0.5 (which can happen if numRands is small). The test len(upperHalf) > 0 covers the same case from the other side: the sublist is then empty, and calculating the average would mean dividing by zero. If all the numbers are > 0.5, count stays at 0 and the sublist is the whole list, so no special handling is needed.
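To try the edge cases without waiting for an unlucky run of random numbers, the same logic can be wrapped in a function and fed fixed lists. This is only a sketch: upper_half_stats and its threshold parameter are hypothetical names, and it uses the built-in sorted() (which returns a new sorted list) in place of sort() so that the caller's list is left untouched.

```python
def upper_half_stats(numbers, threshold=0.5):
    """Return (count, average) of the values not below threshold.

    The average is None when no value qualifies.
    """
    ordered = sorted(numbers)           # sorted copy; numbers is unchanged
    count = 0
    # Skip the values below the threshold, without running off the end.
    while count < len(ordered) and ordered[count] < threshold:
        count += 1
    upper = ordered[count:]
    if len(upper) == 0:                 # avoid dividing by zero
        return 0, None
    total = 0.0
    for num in upper:
        total += num
    return len(upper), total / len(upper)

print(upper_half_stats([0.125, 0.25]))           # (0, None): all below 0.5
print(upper_half_stats([0.625, 0.875]))          # (2, 0.75): all above 0.5
print(upper_half_stats([0.25, 0.625, 0.875]))    # (2, 0.75): mixed
```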
