Encapsulation and private data

Programmers used to other object-oriented languages such as C++, C# or Java will be familiar with the idea of private data fields and methods within a class. Private fields let the programmer implement encapsulation: the idea that the data (and some of the methods that deal with the data) should be accessible only within the class and should not be directly available to code written outside it.

Before you proceed, you should have a general idea of how to create a class and its constructor, so see the post on class definitions if you need a refresher.

For example, suppose we have a class representing a person, and we wish to store the person’s first, middle and last names in each object that is an instance of that class. If these data fields were public, then using code external to the class we could reassign these names wherever we wished. If the data fields were private, however, then we could control access to them by writing interface methods that allow the data to be changed or accessed only under certain conditions. We might want to place a check that the user isn’t trying to rename someone with an offensive name, or that the name contains only alphabetical characters, or even that the user has permission to access the data.

Python takes a more relaxed approach to encapsulation than what you might be used to in other languages. It’s not strictly possible to create totally private data fields or methods; there’s always some way of accessing them from external code. But Python does allow the programmer to make it abundantly clear that such data should be treated as private.

The convention in Python is that any class variable or method name that begins with an underscore character should be treated as private. If we use only a single underscore, the variable is, in fact, still just as publicly available as any ordinary variable, so it’s up to the programmer to honour the request that it be kept private.
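As a quick illustration of the single-underscore convention, here is a minimal sketch. The Account class and its _balance field are invented for this example only. Nothing in Python stops outside code from reading or changing such a field:

```python
class Account:
    def __init__(self, balance):
        # Single underscore: a request to treat this field as private
        self._balance = balance

acct = Account(100)
print(acct._balance)   # Works: Python does not block the access
acct._balance = -50    # Also works, though it ignores the author's request
print(acct._balance)
```

The underscore is purely a signal to other programmers; the interpreter treats _balance like any other attribute.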

Access to private methods

Beginning a variable or method name with two underscores, however, does allow Python some control over access to it. It’s easiest to see all this with some examples, so here’s a sample Person class:

class Person:
    def __init__(self, nf = 'Glenn', nl = 'Rowe', nm = 'William'): 
        self.first = nf
        self._middle = nm
        self.__last = nl
    
    def PrintFirstName(self):
        print(self.first)

    def _PrintShortName(self):
        print(self.first + ' ' + self.__last)

    def __PrintFullName(self):
        print(self.first + ' ' + self._middle + ' ' + self.__last)

me = Person()
me.PrintFirstName()
me._PrintShortName()
me.__PrintFullName()

I’ve defined the class with a constructor that takes as arguments the first, last and middle names of the person, with default values provided for all three. The constructor assigns these to the three object variables first, _middle (with one underscore) and __last (with two underscores). We also define an ordinary method PrintFirstName() (with no underscores), a method _PrintShortName() (with one underscore) and a method __PrintFullName() (with two underscores). Following this we create a Person object called me and attempt to call each of the three methods.

If you try to run this code, you’ll find that the first two methods work properly, printing ‘Glenn’ and ‘Glenn Rowe’ respectively, but the program dies on the third method with an AttributeError: ‘Person’ object has no attribute ‘__PrintFullName’. This illustrates Python’s way of implementing a private method. Since __PrintFullName() begins with two underscores, it can be called by that name only from within the class in which it is defined.

You’ll also notice that the call to _PrintShortName() did work, even though the print statement within that method refers to the variable self.__last, which has two underscores. This is acceptable, since this variable is being accessed by a method that is within the Person class, so it is visible.

[Actually, if you run the above code in the Python PowerShell, both me._PrintShortName() and me.__PrintFullName() give the ‘object has no attribute’ error. If you run the code within the Python console (within Visual Studio), only me.__PrintFullName() throws an error. I’m not sure what causes the inconsistency, although I believe that the me._PrintShortName() statement should run without errors. In any case, if you’re respecting the programmer’s intent that any variable or method whose first character is an underscore (single or double) should be private, you shouldn’t be referring to them directly from outside the class anyway.]
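If you want to see the error from the double-underscore call without the program stopping, one option is to wrap the call in a try/except block. This is just a sketch using a cut-down copy of the Person class; the variable name message is my own:

```python
class Person:
    def __init__(self, nf='Glenn', nl='Rowe', nm='William'):
        self.first = nf
        self._middle = nm
        self.__last = nl

    def __PrintFullName(self):
        print(self.first + ' ' + self._middle + ' ' + self.__last)

me = Person()
try:
    # Called from outside the class, so the name is not found
    me.__PrintFullName()
except AttributeError as err:
    message = str(err)
    print('AttributeError:', message)
```

The program carries on after the except block, which makes it easier to experiment with several calls in one run.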

To show that the method __PrintFullName() is visible from within the class, we can add a method as shown:

class Person:
    def __init__(self, nf = 'Glenn', nl = 'Rowe', nm = 'William'): 
        self.first = nf
        self._middle = nm
        self.__last = nl
    
    def PrintFirstName(self):
        print(self.first)

    def _PrintShortName(self):
        print(self.first + ' ' + self.__last)

    def __PrintFullName(self):
        print(self.first + ' ' + self._middle + ' ' + self.__last)

    def PrintName(self):
        self.__PrintFullName()

me = Person()
me.PrintFirstName()
me._PrintShortName()
me.PrintName()

The method PrintName() is a member of the Person class, so it can legitimately call the private method __PrintFullName(). This code runs without errors.

In practice, we might insert some extra code in PrintName() to check that the user has access to the data fields, or some other error-checking code.
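As a rough sketch of what such a check might look like, the version of PrintName() below takes an authorized flag. This flag is a hypothetical stand-in for whatever real permission test your program would use; it is not part of the original class:

```python
class Person:
    def __init__(self, nf='Glenn', nl='Rowe', nm='William'):
        self.first = nf
        self._middle = nm
        self.__last = nl

    def __PrintFullName(self):
        print(self.first + ' ' + self._middle + ' ' + self.__last)

    def PrintName(self, authorized=False):
        # 'authorized' is a made-up flag standing in for a real permission check
        if not authorized:
            print('Access denied')
            return False
        self.__PrintFullName()
        return True

me = Person()
denied = me.PrintName()                   # prints 'Access denied'
allowed = me.PrintName(authorized=True)   # prints the full name
```

Because outside code has to go through PrintName(), the check can't be skipped without deliberately working around the class's interface.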

Access to private variables

Suppose we now add the following lines at the end of the above program.

print(me.first)
print(me._middle)
print(me.__last)

You’ll find that the first two lines print out their arguments correctly, but the last line throws the error: ‘Person’ object has no attribute ‘__last’. Again, because __last begins with two underscores, it is not accessible under that name from outside the class.

Now suppose we delete these 3 lines and add these 3 lines instead:

me.__last = 'Wibble'
me.PrintName()
print(me.__last)

You might expect the program to die on the first line, since we’re attempting to assign a new value to a private variable. In fact, this program runs without any apparent errors, although the call to me.PrintName() still shows the last name as ‘Rowe’ and not ‘Wibble’. What’s happened?  To understand this, we need to look at what’s known as name mangling.

Name mangling

When a private variable or method is defined within a class, as with __last and __PrintFullName() above, it is actually assigned a ‘mangled’ name within the Python interpreter. The mangled name has the syntax _ClassName__variable. That’s a single underscore followed by the name of the class in which it’s defined, followed by the variable’s or method’s name (which starts with two underscores; variables or methods that don’t start with two underscores are not mangled, and neither are names that also end with two underscores, such as __init__). So the mangled version of __last above is _Person__last. When written out in full, this name is actually public and can be directly accessed by code outside the class.
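One way to see the mangled names for yourself is to ask Python which attributes an object actually holds, using the built-in vars() and hasattr() functions. The sketch below uses a cut-down copy of the Person class; the variable names field_names and has_mangled_method are my own:

```python
class Person:
    def __init__(self, nf='Glenn', nl='Rowe', nm='William'):
        self.first = nf
        self._middle = nm
        self.__last = nl

    def __PrintFullName(self):
        print(self.first + ' ' + self._middle + ' ' + self.__last)

me = Person()

# vars() returns the object's attribute dictionary; list its keys in sorted order
field_names = sorted(vars(me))
print(field_names)

# The method is stored on the class under its mangled name only
has_mangled_method = hasattr(Person, '_Person__PrintFullName')
has_plain_method = hasattr(Person, '__PrintFullName')
print(has_mangled_method, has_plain_method)
```

The list printed is ['_Person__last', '_middle', 'first'], followed by True False. Only the double-underscore names have been rewritten; first and _middle are stored exactly as written.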

However, when we write a statement like me.__last = 'Wibble' as in the above example, what is happening is that a brand new data field with the name __last is created and attached to the me object, so me now has a private data field with the full name _Person__last and a totally different data field with the name __last. The latter data field is actually completely public (since it was created outside the class definition), so the print(me.__last) statement above will print out its value.

However, the assignment me.__last = 'Wibble' has no effect on the private variable _Person__last, so the statement me.PrintName() prints out the person’s name unaffected by the assignment of me.__last = 'Wibble' in the previous line.

To verify all this, add these two lines after the previous three:

me._Person__last = 'Widget'
me.PrintName()

You’ll see that the private data field _Person__last now has been changed, and when the full name is printed out, the last name has been changed. Thus it is possible to access private data from outside the class in which it was defined, but it’s not recommended.
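To see both data fields sitting side by side on the same object, you can print the object's attribute dictionary with the built-in vars() function. This is a small self-contained sketch using a cut-down Person class; the variable name fields is my own:

```python
class Person:
    def __init__(self, nf='Glenn', nl='Rowe', nm='William'):
        self.first = nf
        self._middle = nm
        self.__last = nl   # stored as _Person__last

me = Person()
me.__last = 'Wibble'          # outside the class: creates a new, unmangled field
me._Person__last = 'Widget'   # outside the class: overwrites the private field

fields = vars(me)
print(fields)
```

The dictionary contains both '_Person__last' (now 'Widget') and a separate '__last' (holding 'Wibble'), confirming that the two assignments touched different fields.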

The moral of the story is this: a private data field or method can be defined only within the class definition (by prefixing the name with two underscores), and can’t be added externally. Once such a name has been defined, it can be accessed externally by prefixing the name with _<ClassName>.  As such access defeats the whole purpose of data encapsulation, it’s not something you should normally do. A better approach is to use interface methods, known as getters and setters.

Getters and setters

C# programmers will be familiar with C# properties, which allow private data fields to be read or written only indirectly. Such access can also be programmed in other OOP languages using the general idea of getter and setter methods. The basic idea is quite simple. We define a private data field such as __last above. We then write public methods to allow reading and writing of this variable. We might modify the Person class above like this:

class Person:
    def __init__(self, nf = 'Glenn', nl = 'Rowe', nm = 'William'): 
        self.first = nf
        self._middle = nm
        self.__last = nl

    def getLast(self):
        return self.__last

    def setLast(self, newLast):
        self.__last = newLast
    
    def PrintFirstName(self):
        print(self.first)

    def _PrintShortName(self):
        print(self.first + ' ' + self.__last)

    def __PrintFullName(self):
        print(self.first + ' ' + self._middle + ' ' + self.__last)

    def PrintName(self):
        self.__PrintFullName()

me = Person()
me.setLast('Roberts')
me.PrintName()
print('New last name:', me.getLast())

The method getLast() could include a check that the user is authorized to retrieve the value of __last, and setLast() could include an error check that the new name is valid, for example. By running the last 4 lines, you’ll see that the last name has been successfully changed from ‘Rowe’ to ‘Roberts’.
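Here is a minimal sketch of a setter with such an error check. The rule used (the new name must consist only of alphabetic characters) is just one possible choice, and returning True or False to signal the outcome is my own assumption rather than part of the class above:

```python
class Person:
    def __init__(self, nl='Rowe'):
        self.__last = nl

    def getLast(self):
        return self.__last

    def setLast(self, newLast):
        # Accept only non-empty, purely alphabetic names (an illustrative rule)
        if not newLast.isalpha():
            print('Invalid name:', newLast)
            return False
        self.__last = newLast
        return True

me = Person()
ok = me.setLast('Roberts')   # accepted
bad = me.setLast('R2D2')     # rejected, the stored name is unchanged
print(me.getLast())
```

Note that str.isalpha() also rejects names containing hyphens, apostrophes or spaces, so a real program would probably need a more forgiving rule.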

Exercise

Rewrite the Product class from the post on class definitions. The new version should have the following features:

  • The class should contain a name and properties dictionary, as before.
  • All data fields should be private (prefixed by a double underscore).
  • Define getters and setters for the various data fields, as appropriate.
  • The getters should just return the corresponding data fields.
  • The setter for the name should ask for a password before allowing the user to change the name of the Product object.
  • The password should be stored as a class data field in the Product class. [OK, this isn’t terribly secure, but I don’t want to make the program too complicated.]

The main program should create several Products as before, by giving a name and dictionary of properties for each Product. Then the user should be offered the option of changing the name of one of the existing products. The user should enter the name of the product to change. The program should then check that this name exists in the current product list and, if so, call that product’s setter to change the name. If the name change succeeds (that is, the user knows the password), the list of product names in the main program should be updated.

See answer

In the file Product.py:

class Product(object):
    """a product sold by the company"""

    __numProducts = 0
    __password = 'wibble'

    def __init__(self, name, **kwargs):
        self.__name = name
        self.__properties = kwargs
        Product.__numProducts += 1

    def getName(self):
        return self.__name

    def setName(self, name):
        pw = input('Enter password: ')
        if (pw == Product.__password):
            self.__name = name
            print('Name changed.')
            return 'success'
        else:
            print('Password incorrect')
            return None

    def getProperties(self):
        return self.__properties

    @staticmethod
    def getNumProducts():
        # Static method: takes no self, so it can be called on the class itself
        return Product.__numProducts

The name of each data field is made private by prefixing it with a double underscore. Getter methods are provided for the __name, __properties and __numProducts fields, but (obviously) not for __password!

The setName() method asks the user for a password and checks that it matches __password. If so, the product’s name is changed and a ‘success’ string is returned. Otherwise, the method returns None, which is Python’s built-in value for ‘nothing’.
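Bear in mind that name mangling applies to class data fields just as it does to object fields, so __password is hidden only by convention. The sketch below uses a stripped-down Product class containing nothing but the password; the variable names visible and leaked are my own:

```python
class Product:
    __password = 'wibble'   # stored on the class as _Product__password

try:
    Product.__password      # outside the class: no such attribute
    visible = True
except AttributeError:
    visible = False

# The mangled name is public, so the password can still be read
leaked = Product._Product__password
print(visible, leaked)
```

This prints False wibble, which is one more reason not to treat a double underscore as a security feature.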

In the main program file, we have:

from Product import *

name = ['widget', 'frobule', 'thingy', 'glotz']
items = [
    Product(name[0], legs = 4, key = True, windup = True, unitCost = 10.50),
    Product(name[1], fuel = 'hydrogen', lifespan = 10, key = False),
    Product(name[2], legs = 10, unitCost = 0.95, warranty = True),
    Product(name[3], colour = 'red')
    ]


while True:
    choice = input('Enter 1 to change a product name or 2 to quit: ')
    if choice == '2':
        break
    elif choice == '1':
        print(f'We have {Product.getNumProducts()} products:', name)
        prodName = input('Enter product name: ')
        if prodName not in name:
            print('No such product')
        else:
            newName = input('Enter new name: ')
            nameIndex = name.index(prodName)
            product = items[nameIndex]
            result = product.setName(newName)
            if result is not None:
                name[nameIndex] = product.getName()

A list of names and corresponding products is created on lines 3 to 9. If the user wishes to change a product’s name, they are asked to input this name on line 18. We then check if that name is in the name list. If so, we ask for the new name on line 22. We retrieve the product object by looking up its index in the items list on lines 23 and 24, then we call the object’s setName() method on line 25. The password check is done by this method and if successful, we update the name list on line 27.

 
