Floats and the math library in Python

Apr 26, 2021 by gwarowe

The float data type

A float in Python is a floating-point approximation to a real number. Unlike integers, floats are restricted in both size and precision. In memory, they are stored as 64-bit objects, in which one bit is used for the sign, 11 bits for the exponent, and the remaining 52 bits for the fractional part. An 11-bit field can hold 2^{11}=2048 different values, from 0 to 2047. The stored value is offset by a bias of 1023, and the values 0 and 2047 are reserved for special cases, so the usable exponents run from -1022 to 1023. The largest float is therefore \left(2-2^{-52}\right)\times 2^{1023}, which is a bit less than 2^{1024}. You can see the limits of the float data type in your version of Python using:

import sys
sys.float_info

With my installation of Python 3.7, this gives the result:

sys.float_info(
max=1.7976931348623157e+308,
max_exp=1024,
max_10_exp=308,
min=2.2250738585072014e-308,
min_exp=-1021,
min_10_exp=-307,
dig=15,
mant_dig=53,
epsilon=2.220446049250313e-16,
radix=2, rounds=1)

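Thus the largest positive float is 1.7976931348623157e+308, and the smallest positive normalized float is 2.2250738585072014e-308, with corresponding negative values. (Smaller positive values, down to about 5e-324, can be stored as subnormal numbers, but with reduced precision.) This is the same format as double-precision floats in other languages.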
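As a rough check on the layout described above, the following sketch unpacks a float into its three bit fields using the standard struct module. The helper name float_bits is made up for this illustration, and the sketch assumes the usual 64-bit IEEE 754 format, which is what the sys.float_info output above indicates.

```python
import struct
import sys

def float_bits(x):
    """Split a float into (sign, exponent field, fraction field).

    Hypothetical helper for illustration; assumes 64-bit IEEE 754 doubles.
    """
    # Reinterpret the 8 bytes of the float as one unsigned 64-bit integer.
    (n,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = n >> 63                    # top bit
    exponent = (n >> 52) & 0x7FF      # next 11 bits, stored with a bias of 1023
    fraction = n & ((1 << 52) - 1)    # low 52 bits
    return sign, exponent, fraction

print(float_bits(1.0))
print(float_bits(sys.float_info.max))

# The largest float has every fraction bit set and the largest usable exponent.
largest = (2 - 2.0**-52) * 2.0**1023
print(largest == sys.float_info.max)
```

The exponent field for 1.0 comes out as 1023 rather than 0 because of the bias: the stored value is the true exponent plus 1023.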

Roundoff error

Because floats use a limited number of bits, they cannot store some real numbers exactly. For example, try entering the sum

0.1 + 0.2

into the console.

Answer:
0.30000000000000004

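The fact that the answer isn’t exactly 0.3 is due to the way fractional parts are stored in floats. Values are held in binary, and neither 0.1 nor 0.2 has a finite binary expansion, so each is rounded to the nearest representable value before the addition takes place.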
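One way to see the stored values is with the standard decimal module, since Decimal(x) converts a float to its exact decimal expansion without any further rounding. This is only a sketch to make the rounding visible, and the variable names are illustrative.

```python
from decimal import Decimal

# Decimal(float) shows the exact value held in memory, digit for digit.
stored_tenth = Decimal(0.1)
stored_sum = Decimal(0.1 + 0.2)
stored_three_tenths = Decimal(0.3)

print(stored_tenth)         # slightly more than 0.1
print(stored_sum)           # slightly more than 0.3
print(stored_three_tenths)  # slightly less than 0.3

# A value such as 0.5 is a power of 2, so it is stored exactly.
print(Decimal(0.5) == Decimal("0.5"))
```

The sum 0.1 + 0.2 and the literal 0.3 are rounded to two different neighbouring floats, which is why they print differently.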

As a result, it’s dangerous to test for equality of two float expressions. For example, the test

0.1 + 0.2 == 0.3

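returns False, even though it is mathematically true. On the other hand, a similar test such as 0.2 + 0.3 == 0.5 returns True, because the rounding errors in 0.2 and 0.3 happen to cancel. Since you can't easily predict which case you will get, exact comparison is unreliable. One solution to this problem is provided by the math library, so we’ll introduce that now and have a look at a few of its functions.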

The math library and isclose()

To use the math library, insert the command import math at the start of your code, or type it into the console. You can get a complete list of functions available in the math library by typing help(math) into the console.

The function that can help us solve the roundoff error problem above is isclose(a, b, rel_tol=1e-09, abs_tol=0.0). The function compares quantities a and b and returns True or False, depending on whether a is ‘close’ to b, given the criteria in rel_tol and abs_tol. Both tolerances are optional and, if given, must be passed by keyword.

rel_tol is the relative tolerance. The relative difference between a and b is \frac{\left|a-b\right|}{\text{max}\left(\left|a\right|,\left|b\right|\right)}, that is, the magnitude of the difference between a and b as a fraction of the larger of \left|a\right| and \left|b\right|. The rel_tol test passes if this relative difference is no greater than rel_tol. The default value of rel_tol is 10^{-9}, so if two floats agree to about 9 significant digits, they are judged to be ‘close’.

abs_tol is an absolute tolerance: the abs_tol test passes if \left|a-b\right|\le abs_tol. This is useful for comparisons with values at or near zero, where the relative test is too strict. Its default value is zero, meaning that the abs_tol test passes only if a and b are exactly equal. isclose() returns True if either test passes, so abs_tol makes a difference only when the rel_tol test fails.
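The combined test can be sketched in a few lines. The function name isclose_sketch is hypothetical, and the formula follows the one given in the Python documentation for math.isclose(). It is meant to illustrate the logic, not to reproduce the library's implementation, and it skips the check that the tolerances are non-negative.

```python
import math

def isclose_sketch(a, b, *, rel_tol=1e-09, abs_tol=0.0):
    """Illustrative re-creation of the math.isclose() test (hypothetical name)."""
    if a == b:
        # Exactly equal values, including two equal infinities, are close.
        return True
    if math.isinf(a) or math.isinf(b):
        # An infinity is not close to anything other than itself.
        return False
    diff = abs(a - b)
    # Pass if the difference is within either the relative or the absolute bound.
    return diff <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(isclose_sketch(0.1 + 0.2, 0.3))
print(isclose_sketch(1e-12, 0.0))                 # relative test alone fails near zero
print(isclose_sketch(1e-12, 0.0, abs_tol=1e-09))  # absolute tolerance rescues it
```

The last two calls show why abs_tol exists: when one value is zero, the relative bound shrinks to nothing, so only an absolute tolerance can make the comparison succeed.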

To experiment, try the following series of statements:

import math
a = 0.1
b = 0.2
c = 0.3
math.isclose(a + b, c)
math.isclose(a + b, c, rel_tol = 1e-15)
math.isclose(a + b, c, rel_tol = 1e-16)
math.isclose(a + b, c, rel_tol = 0, abs_tol = 1e-16)
math.isclose(a + b, c, rel_tol = 0, abs_tol = 1e-17)

Answers:
math.isclose(a + b, c) : True
math.isclose(a + b, c, rel_tol = 1e-15) : True
math.isclose(a + b, c, rel_tol = 1e-16) : False
math.isclose(a + b, c, rel_tol = 0, abs_tol = 1e-16) : True
math.isclose(a + b, c, rel_tol = 0, abs_tol = 1e-17) : False

Note that the rel_tol test returns False with a value of 1e-16, while abs_tol has to drop to 1e-17 before it returns False. You can see why by calculating the relative difference between the two quantities being compared, which are a + b and c. The larger of the two is a + b, so calculate (a + b - c) / (a + b), which gives approximately 1.85e-16. This is greater than 1e-16 but less than 1e-15, so the rel_tol test returns True for 1e-15, but False for 1e-16.

Now calculate a + b - c, which gives a value of 5.551115123125783e-17. This is less than 1e-16 but greater than 1e-17, so if we switch off rel_tol by setting it to zero, then abs_tol returns True for 1e-16 but False for 1e-17.
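The two differences can be checked directly in the console. Note that isclose() is comparing a + b with c, so the relative difference below is measured against the larger of those two quantities. The names abs_diff and rel_diff are just labels for this sketch.

```python
a = 0.1
b = 0.2
c = 0.3

# Absolute difference between the two values passed to isclose().
abs_diff = abs((a + b) - c)

# Relative difference: the same gap as a fraction of the larger value.
rel_diff = abs_diff / max(abs(a + b), abs(c))

print(abs_diff)   # lies between 1e-17 and 1e-16
print(rel_diff)   # lies between 1e-16 and 1e-15
```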

Other math functions

The math library contains all the usual math functions such as trig functions and inverse trig functions, hyperbolic functions, exponentials and logarithms. Have a look at help(math) to see the full list. We’ll probably return to these later when we need them.
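As a small taste, here are a few of these functions in use. Since their results are floats, the checks below use isclose() rather than ==. The particular identities chosen are just examples.

```python
import math

# Trig and inverse trig: sin(pi/6) = 1/2 and 4*atan(1) = pi.
print(math.isclose(math.sin(math.pi / 6), 0.5))
print(math.isclose(4 * math.atan(1.0), math.pi))

# Exponential and logarithm are inverses of each other.
print(math.isclose(math.exp(math.log(10.0)), 10.0))

# Hyperbolic identity: cosh(x)**2 - sinh(x)**2 = 1.
x = 1.5
print(math.isclose(math.cosh(x) ** 2 - math.sinh(x) ** 2, 1.0))

# sin(pi) is tiny but not exactly zero, so comparing with 0 needs abs_tol.
print(math.isclose(math.sin(math.pi), 0.0))
print(math.isclose(math.sin(math.pi), 0.0, abs_tol=1e-12))
```

The last pair of lines is the near-zero case again: the default relative test fails against 0.0, while a small absolute tolerance accepts the result.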

 
