{"id":344,"date":"2021-05-18T12:30:57","date_gmt":"2021-05-18T11:30:57","guid":{"rendered":"https:\/\/glennrowe.net\/programmingpages\/?p=344"},"modified":"2021-05-31T12:14:33","modified_gmt":"2021-05-31T11:14:33","slug":"sets","status":"publish","type":"post","link":"https:\/\/glennrowe.net\/programmingpages\/2021\/05\/18\/sets\/","title":{"rendered":"Sets"},"content":{"rendered":"<p>The <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">set<\/code> data type in Python implements most of the operations from mathematical sets. The most common representation of sets in mathematics uses Venn diagrams (overlapping circles) which I&#8217;m assuming are familiar.<\/p>\n<p>In Python, a <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">set<\/code> is a collection data type which is\u00a0<em>mutable<\/em> (can be altered), but whose elements must be\u00a0<em>immutable<\/em> data types, such as primitive numeric data types, strings and tuples. A set cannot contain duplicate values.<\/p>\n<p>The elements in a set are\u00a0<em>unordered<\/em>, which means they may not appear in the same order every time a set is iterated over, for example in a for loop. 
Because of this, it is not possible to access members of sets using numerical indexes or slicing (such as you would use to access members of lists and tuples).<\/p>\n<h2>Creating sets<\/h2>\n<p>A set can be created by listing its elements in braces (curly brackets):<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">mySet = {1, 2, 'wibble', 3.14}<\/pre>\n<p>As usual in Python, we can mix data types in a set.<\/p>\n<p>A set may also be created from a preexisting list or tuple by using the set constructor, as in:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">a = [1, 2, 3, 4]\r\nsetA = set(a)\r\nb = 'x', 'y', 'z'\r\nsetB = set(b)<\/pre>\n<p>If you print out <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> and <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setB<\/code>, you may find that the order of the elements is not the same as in the original list or tuple; this is because sets are unordered.<\/p>\n<p>As mentioned above, the elements of a set must be immutable data types. This means we can create a set of primitive data types like ints, floats, strings and complex numbers. As tuples are immutable, we can also create a set of tuples (provided the tuples themselves contain only immutable elements). However, we cannot create a set of lists, as lists are mutable (their elements can be changed, and they can be extended or contracted by adding or deleting elements).<\/p>\n<p>As we saw in the last example, however, you\u00a0<em>can<\/em> create a set\u00a0<em>from<\/em> the elements in a list, provided that the list elements are immutable. The list elements are extracted and inserted into the set.<\/p>\n<h2>Adding and deleting set elements<\/h2>\n<p>You can add an element to a set using the <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">add()<\/code> method, as in <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA.add(44)<\/code>. 
For <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> as defined above, the new contents of <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> are now {1, 2, 3, 4, 44}.<\/p>\n<p>You can add multiple elements to a set using <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">update()<\/code>. The argument to <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">update()<\/code> must be an iterable such as a list or tuple; it\u00a0<em>cannot<\/em> be a non-iterable value such as an int. Thus <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA.update([45, 53, 77])<\/code> (adding a list) and <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA.update((45, 53, 77))<\/code> (adding a tuple; note the double parentheses) are allowed, but <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA.update(45, 53, 77)<\/code> is not. The argument to <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">update()<\/code> can also be another set, so we can have <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA.update({45, 53, 77})<\/code>. This latter option is equivalent to forming the union of <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> with the set passed to <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">update()<\/code>, and storing the result back in <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code>.<\/p>\n<p>An element can be removed from a set using either <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">discard()<\/code> or <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">remove()<\/code>, both of which take a single argument which is the element to be discarded or removed. The two methods are equivalent if their argument is present in the set. 
If the argument is\u00a0<em>not<\/em> present, <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">discard()<\/code> will silently do nothing, while <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">remove()<\/code> will raise a <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">KeyError<\/code>.<\/p>\n<h2>Set operations<\/h2>\n<p>The usual mathematical set operations are supported. Set union can be done using either the <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">|<\/code> (bitwise OR) operator or with the method <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">union()<\/code>, as in <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA.union(setB)<\/code>. If we want the union of <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> and <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setB<\/code> above, we can write either <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setC = setA | setB<\/code> or <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setC = setA.union(setB)<\/code>. Note that if the same element occurs in both <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> and <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setB<\/code>, it appears only once in <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setC<\/code> because sets do not store duplicates. The union operation returns a new set that is the union of its two arguments, so the two original sets are not changed.<\/p>\n<p>Intersection can be done using either the <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">&amp;<\/code> (bitwise AND) operator or with the <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">intersection()<\/code> method. 
Thus we can have either <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setD = setA &amp; setB<\/code> or <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setD = setA.intersection(setB)<\/code>.<\/p>\n<p>The difference between <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> and <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setB<\/code> (that is, the set of elements in <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> but not in <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setB<\/code>) is done using either the <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">-<\/code> (minus) operator or with the method <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">difference()<\/code>.<\/p>\n<p>The symmetric difference (elements in <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA<\/code> or <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setB<\/code> but not in both) can be done using the <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">^<\/code> (bitwise XOR) operator or with the method <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">symmetric_difference()<\/code>.<\/p>\n<p>There are various methods that return a boolean value. 
The methods <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">isdisjoint()<\/code>, <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">issubset()<\/code> and <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">issuperset()<\/code> test whether one set is disjoint from another (<code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">setA.isdisjoint(setB)<\/code> is true if the two sets have no elements in common), is a subset of it (all the elements of the first set are also in the second set) or is a superset of it (all the elements of the second set are also in the first set).<\/p>\n<h2>Exercise<\/h2>\n<p>Write a program that asks the user to enter some details about several people, including their name, age, salary and gender, where name and gender are strings, age is an int and salary is a <a href=\"https:\/\/glennrowe.net\/programmingpages\/2021\/04\/28\/decimals\/\">decimal<\/a>. Store the details for each person in a <a href=\"https:\/\/glennrowe.net\/programmingpages\/2021\/05\/15\/tuples\/\">namedtuple<\/a>, and add each namedtuple to a set.<\/p>\n<p>From this master set, construct sets containing each of the following. Use <a href=\"https:\/\/glennrowe.net\/programmingpages\/2021\/05\/12\/list-comprehension\/\">list comprehension<\/a> where appropriate to specify the elements of some of the sets.<\/p>\n<ul>\n<li>a set of all the males<\/li>\n<li>a set of all the females (feel free to expand these categories to allow for more genders if you like)<\/li>\n<li>a set of everyone under age 40<\/li>\n<li>a set of everyone age 40 or over<\/li>\n<li>a set of everyone with a salary under 10,000<\/li>\n<li>a set of everyone with a salary of 10,000 or more<\/li>\n<li>a set of all the males with a salary of 10,000 or more<\/li>\n<li>a set of all the females and everyone age 40 or over<\/li>\n<\/ul>\n<p>Print out each set. If you just use a print command on a set of namedtuples, the output isn&#8217;t exactly pretty, but it will do for now. 
Feel free to clean it up if you like.<\/p>\n<span class=\"collapseomatic \" id=\"id69f2ebbd0b8fb\"  tabindex=\"0\" title=\"See answer\"    >See answer<\/span><span id='swap-id69f2ebbd0b8fb'  class='colomat-swap' style='display:none;'>Hide answer<\/span><div id=\"target-id69f2ebbd0b8fb\" class=\"collapseomatic_content \">\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">from collections import namedtuple\r\nfrom decimal import Decimal\r\nPerson = namedtuple('Person', ['name', 'age', 'salary', 'gender'])\r\npersonSet = set()\r\nwhile True:\r\n    data = input('Enter name, age, salary, gender (comma separators), or \\'quit\\':')\r\n    if data == 'quit':\r\n        break\r\n    data = data.split(',')\r\n    personSet.add(Person(data[0], int(data[1]), Decimal(data[2]), data[3]))\r\n\r\nmales = set([x for x in personSet if x.gender == 'm'])\r\nfemales = personSet - males\r\nunder40 = set([x for x in personSet if x.age &lt; 40])\r\nover40 = personSet - under40\r\nunder10k = set([x for x in personSet if x.salary &lt; 10000])\r\nover10k = personSet - under10k\r\nmalesOver10k = males &amp; over10k\r\nfemalesPlusOver40 = females | over40\r\n\r\nprint('Males:',males)\r\nprint('Females:',females)\r\nprint('Under age 40:',under40)\r\nprint('Over age 40:',over40)\r\nprint('Salary under 10k:',under10k)\r\nprint('Salary over 10k:',over10k)\r\nprint('Males over 10k:',malesOver10k)\r\nprint('All females &amp; everyone over age 40:',femalesPlusOver40)<\/pre>\n<p>We define the namedtuple on line 3 and initialize the master set <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">personSet<\/code> on line 4.<\/p>\n<p>The while loop on line 5 reads in the data for each person, with the fields separated by commas. Note that you should enter the data separated\u00a0<em>only<\/em> by commas, with no additional whitespace: <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">split(',')<\/code> on line 9 does not strip whitespace, so a gender entered as ' m' would not match 'm' on line 12. 
On line 10, we add a namedtuple to the set <code class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">personSet<\/code>.<\/p>\n<p>After the user types &#8216;quit&#8217;, we build the sets starting on line 12. We use list comprehension to select all the namedtuples with male gender. The females set is obtained by taking the difference between the master set and the males set.<\/p>\n<p>Similarly, we construct the sets for the age groups and salary groups on lines 14 through 17.<\/p>\n<p>The set of males with a salary of 10,000 or more is formed by the intersection of the males set and the over10k set. The set consisting of all females and everyone age 40 or over is the union of females and over40.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The set data type in Python implements most of the operations from mathematical sets. The most common representation of sets in mathematics uses Venn diagrams (overlapping circles) which I&#8217;m assuming are familiar. 
In Python, a set is a collection data type which is\u00a0mutable (can be altered), but whose elements must be\u00a0immutable data types, such as [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"class_list":["post-344","post","type-post","status-publish","format-standard","hentry","category-python","entry"],"_links":{"self":[{"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/posts\/344","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/comments?post=344"}],"version-history":[{"count":4,"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/posts\/344\/revisions"}],"predecessor-version":[{"id":348,"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/posts\/344\/revisions\/348"}],"wp:attachment":[{"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/media?parent=344"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/categories?post=344"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/glennrowe.net\/programmingpages\/wp-json\/wp\/v2\/tags?post=344"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}