Sets#

  • Sets, similar to lists and tuples, are used to store multiple items in a single variable.

  • A set is a collection which is unordered and unindexed.

    • An individual item cannot be changed in place, but you can remove items and add new ones.


Type     Collection          Syntax   Ordered   Indexed   Mutable   Passed By   Duplicates Allowed
string   characters          " "      yes       yes       no        value       yes
list     any data type       [ ]      yes       yes       yes       reference   yes
tuple    any data type       ( )      yes       yes       no        value       yes
set      any hashable type   { }      no        no        yes       reference   no

Creating a Set#

  • Sets are written with curly brackets { }.

colors = {"blue", "red", "green"}
colors2 = ["blue", "red", "green"]

print(colors)
print(colors2)
print(type(colors))
print(type(colors2))
{'blue', 'red', 'green'}
['blue', 'red', 'green']
<class 'set'>
<class 'list'>
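
A set can also be built from any iterable with the built-in set() constructor. This is also the only way to write an empty set, because empty curly brackets create a dictionary. A minimal sketch (the variable names are illustrative):

```python
# Build a set from another iterable with the set() constructor.
colors_from_list = set(["blue", "red", "green"])

# Empty curly brackets create a dict; use set() for an empty set.
empty_braces = {}
empty_set = set()

print(type(colors_from_list).__name__)  # set
print(type(empty_braces).__name__)      # dict
print(type(empty_set).__name__)         # set
```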

Sets are Unordered#

  • Unordered means that the items in a set do not have a defined order.

  • Set items can appear in a different order from one run of the program to the next, so never rely on the order in which they are printed or iterated.

  • Set items cannot be referred to by index or key.

a = {"blue", "red", "green"}
print(a)
{'blue', 'red', 'green'}
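
Because a set has no positions, indexing fails, while membership tests and iteration still work. The sketch below illustrates this and uses sorted() to get a list with a predictable order when one is needed:

```python
a = {"blue", "red", "green"}

# Sets do not support indexing, so a[0] raises TypeError.
try:
    a[0]
    indexable = True
except TypeError:
    indexable = False
print(indexable)  # False

# Membership tests work without an index.
print("red" in a)  # True

# sorted() returns a list with a predictable order.
ordered = sorted(a)
print(ordered)  # ['blue', 'green', 'red']
```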

Immutable#

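  • Set items must be immutable (hashable) values, and an item cannot be changed in place after the set has been created.

    • You can remove items and add new items.

  • A function receives a reference to the same set object, not a copy. Rebinding the parameter inside the function, as in the example below, does not affect the caller's set.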

def func(a):
    # Rebinds the local name only; the caller's set is left untouched.
    a = {"blue", "red", "green", "yellow"}

a = {"blue", "red", "green"}

func(a)

print(a)
{'blue', 'red', 'green'}
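
The example above only rebinds the parameter name inside the function. A function that modifies the set in place does change the caller's set, because both names refer to the same object. A minimal sketch (add_yellow is a hypothetical helper, not part of the original example):

```python
def add_yellow(colors):
    # Modifies the set object itself, which the caller also sees.
    colors.add("yellow")

a = {"blue", "red", "green"}
add_yellow(a)

print("yellow" in a)  # True
print(len(a))         # 4
```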

Duplicates Not Allowed#

  • Sets cannot have two items with the same value.

  • Duplicate values will be ignored:

thisset = {"blue", "red", "green", "red", "red", "green"}

print(thisset)

print(len(thisset))
{'blue', 'red', 'green'}
3

Length of Set#

  • To determine how many items a set has, use the len() function.

  • Converting a list to a set drops the duplicates, so len() then counts only the unique items:

listt = ["blue", "red", "green", "red", "red"]
print(listt)
print(len(listt))
['blue', 'red', 'green', 'red', 'red']
5
listt = set(listt)
print(listt)
print(len(listt))
{'blue', 'red', 'green'}
3

Set Items - Data Types#

  • Set items can be of any hashable data type, e.g. str, int, float, bool and tuple. Mutable containers such as list, dict and set cannot be set items.

string = "aaab"
set(string)
{'a', 'b'}
set1 = {"blue", "red", "green"}
set2 = {1, 5, 7, 9, 3}
set3 = {True, False, False}
set4 = {"abc", 34, True, 40, "male"}

print(set4)
{'abc', 34, True, 'male', 40}
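
The following sketch illustrates two details about item types: unhashable values such as lists are rejected with a TypeError, and because True == 1 and False == 0, booleans count as duplicates of those integers. The variable names are illustrative:

```python
# Hashable values such as str, int, float and tuple can be set items.
mixed = {"abc", 34, 2.5, (1, 2)}
print(len(mixed))  # 4

# Mutable containers such as lists are unhashable and are rejected.
try:
    bad = {[1, 2], "abc"}
    accepted = True
except TypeError:
    accepted = False
print(accepted)  # False

# True == 1 and False == 0, so they count as duplicates of those ints.
flags = {1, True, 0, False}
print(len(flags))  # 2
```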

Set Operations: Add Set Items#

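  • Once a set is created, you cannot change an existing item in place, but you can add new items.

  • To add one item to a set, use the add() method. For comparison, the example first grows a list with append() and extends a string by building a new string with +.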

listt = [1, 2, 3]
listt.append(4)
print(listt)
[1, 2, 3, 4]
string = "abc"
string = string + "d"
print(string)
abcd
thisset = {"blue", "red", "green"}
thisset.add("orange")
print(thisset)
{'blue', 'orange', 'red', 'green'}
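
Two properties of add() are worth checking by hand: adding an item that is already present does nothing, and the method changes the set in place rather than returning a new one. A short sketch:

```python
thisset = {"blue", "red", "green"}

# Adding an item that is already present leaves the set unchanged.
thisset.add("red")
print(len(thisset))  # 3

# add() changes the set in place and returns None.
result = thisset.add("orange")
print(result)        # None
print(len(thisset))  # 4
```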

Update Sets#

  • To add items from another set into the current set, use the update() method.

  • Add elements from color_2 into color_1 (color_2 itself is left unchanged):

color_1 = {"blue", "red", "green"}
color_2 = {"yellow", "indigo", "orange"}

color_1.update(color_2)

# color_1.add("yellow")
# color_1.add("indigo")
# color_1.add("orange")

print(color_1)
print(color_2)
{'blue', 'orange', 'yellow', 'indigo', 'red', 'green'}
{'indigo', 'orange', 'yellow'}
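
The argument to update() does not have to be a set: any iterable works, and several can be passed at once. A minimal sketch with illustrative variable names:

```python
colors = {"blue", "red"}

# update() accepts any iterable, and more than one at a time.
colors.update(["green", "red"], ("yellow",))
print(sorted(colors))  # ['blue', 'green', 'red', 'yellow']

# A string is an iterable of characters, so each one is added separately.
letters = set()
letters.update("abca")
print(sorted(letters))  # ['a', 'b', 'c']
```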

Remove Item#

  • To remove an item from a set, use the remove() or the discard() method.

  • If the item to remove does not exist, remove() raises a KeyError, while discard() will NOT raise an error.

  • To remove all items at once, use the clear() method.

thisset = {"blue", "red", "green"}
thisset.remove("blue")
print(thisset)
{'red', 'green'}
thisset = {"blue", "red", "green"}
thisset.discard("blue")
print(thisset)
{'red', 'green'}
thisset = {"blue", "red", "green"}
thisset.clear()
print(thisset)
set()
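
The difference between remove() and discard() only shows when the item is missing. The sketch below demonstrates it and also shows pop(), which removes and returns an arbitrary item:

```python
thisset = {"blue", "red", "green"}

# remove() raises KeyError when the item is missing.
try:
    thisset.remove("purple")
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# discard() ignores a missing item.
thisset.discard("purple")
print(len(thisset))  # 3

# pop() removes and returns an arbitrary item.
item = thisset.pop()
print(item in thisset)  # False
print(len(thisset))     # 2
```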

Set Operations#

Set Operations - Union#

  • There are several ways to join two or more sets in Python.

  • The union() method returns a new set containing all items from both sets, while update() adds the items of the other set to the existing set in place and returns None.

  • Both union() and update() will exclude any duplicate items.

set1 = {"a", "b", "c"}
set2 = {1, 2, 3}

set1 = set1.union(set2)
print(set1)
print(set2)
{'c', 'b', 1, 2, 3, 'a'}
{1, 2, 3}
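
To make the contrast concrete, the sketch below compares union(), its operator form |, and update() on the same two sets. The names combined and result are illustrative:

```python
set1 = {"a", "b", "c"}
set2 = {1, 2, 3}

# union() returns a new set and leaves set1 unchanged.
combined = set1.union(set2)
print(len(combined))  # 6
print(len(set1))      # 3

# The | operator is equivalent to union() for two sets.
print(combined == (set1 | set2))  # True

# update() changes set1 in place and returns None.
result = set1.update(set2)
print(result)            # None
print(set1 == combined)  # True
```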

Set Operations - Intersection#

The intersection_update() method will keep only the items that are present in both sets.

Keep the items that exist in both set x, and set y:

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

x.intersection_update(y)

print(x)
{'apple'}

The intersection() method will return a new set that contains only the items present in both sets. The original sets are left unchanged.

Return a set that contains the items that exist in both set x, and set y:

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

x = x.intersection(y)

print(x)
{'apple'}
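
The & operator is the operator form of intersection() for two sets, and isdisjoint() answers the related question of whether two sets share anything at all. A short sketch using the same data:

```python
x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

# intersection() and & return a new set; x is left unchanged.
common = x & y
print(common)                       # {'apple'}
print(common == x.intersection(y))  # True
print(len(x))                       # 3

# isdisjoint() is True only when the sets share no items.
print(x.isdisjoint(y))              # False
```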

Symmetric Difference#

  • Keep all items, except those that appear in both sets.

  • The symmetric_difference_update() method updates the set in place, keeping only the elements that are present in exactly one of the two sets.

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

x.symmetric_difference_update(y)

print(x)
{'banana', 'google', 'cherry', 'microsoft'}

The symmetric_difference() method will return a new set that contains only the elements present in exactly one of the two sets.

Return a set that contains all items from both sets, except items that are present in both:

x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

z = x.symmetric_difference(y)

print(z)
{'banana', 'google', 'cherry', 'microsoft'}
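
The symmetric difference can also be written with the ^ operator, and it equals the union of the two sets minus their intersection. A small sketch that checks both statements:

```python
x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

# ^ is the operator form of symmetric_difference().
z = x ^ y
print(z == x.symmetric_difference(y))  # True

# Same result: everything in either set, minus what they share.
print(z == (x | y) - (x & y))          # True
print("apple" in z)                    # False
```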

Summary#


Type     Collection          Syntax   Ordered   Indexed   Mutable   Passed By   Duplicates Allowed
string   characters          " "      yes       yes       no        value       yes
list     any data type       [ ]      yes       yes       yes       reference   yes
tuple    any data type       ( )      yes       yes       no        value       yes
set      any hashable type   { }      no        no        yes       reference   no

Exercises#

Write a function that accepts a list or a string as input and returns a set of the unique items in that sequence.

# a = {}  would create an empty dict, not an empty set
a = set()
type(a)
set
def drop_duplicates(sequence):
    
    return 
    
    
assert drop_duplicates([1, 1, 2, 2, 2, 3, 3])=={1, 2, 3}
assert drop_duplicates("aaaabbbbccccc")=={"a", "b", "c"}
assert drop_duplicates([1, 2, 3])=={1, 2, 3}
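
The stub above fails its assertions until the body is filled in. One possible solution, assuming every item in the sequence is hashable, is to pass the sequence straight to set():

```python
def drop_duplicates(sequence):
    # set() iterates over the sequence and keeps each distinct item once.
    return set(sequence)

assert drop_duplicates([1, 1, 2, 2, 2, 3, 3]) == {1, 2, 3}
assert drop_duplicates("aaaabbbbccccc") == {"a", "b", "c"}
assert drop_duplicates([1, 2, 3]) == {1, 2, 3}
```

This works for lists and strings alike because set() accepts any iterable. The order of the original sequence is not preserved in the result.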