6.1. Native Data Types
All computer programs are designed to process data and the atomic unit of all data is a value.
In Python, every value is an object. This means that every value satisfies the following three properties that every object must satisfy:
- Every object has a type. You can check a variable’s data type using the built-in function `type`.
- Every object has an internal data representation.
- Every object has a set of procedures for interacting with it.
Every value is an instance of a type. For example, the value `3` is an instance of the type `int` and the value `3.0` is an instance of the type `float`.
type(3), type(3.14), type('3.14'), type(True), type(None)
(int, float, str, bool, NoneType)
Even a type is an object in Python. The type of `int` is `type`, and the type of `type` is `type`.
a = type(3)
type(a)
type
Python comes pre-packaged with two broad categories of data types: elementary data types and compound data structures.
Elementary data types are the basic data types that are built into the Python language. These types are used to represent single values, and include:
- Integers: `int`
- Floats: `float`
- Booleans: `bool`
- Strings: `str`
- None: `NoneType`
Compound data structures are used to group together multiple values. These types are used to represent collections of data.
The procedures associated with an object are individually called methods and collectively constitute the interface for manipulating the object. In the context of object-oriented programming, data that is associated with an object is called an attribute.
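As a small illustration of the distinction between methods and attributes, the sketch below calls a method and reads an attribute on built-in values. The variable names (`number`, `z`, `public_names`) are chosen here for illustration only.

```python
# Every value is an object, so even simple numbers expose methods and attributes.
number = 3.5
z = 3 + 4j

# Methods are procedures attached to the object; they are called with ().
print(number.is_integer())   # False
print("rainbow".upper())     # RAINBOW

# Attributes are data attached to the object; they are read without ().
print(z.real, z.imag)        # 3.0 4.0

# dir() lists the names (methods and attributes) that an object exposes.
public_names = [name for name in dir(number) if not name.startswith("_")]
print(public_names)
```

The list printed by `dir` is the public part of the interface of a `float`.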
Before we define and create our own classes, let’s first look at the built-in data structures in Python.
Data structures, as the name suggests, are structures designed for organizing, processing, manipulating, retrieving and storing data.
Most programming languages provide a set of native data structures to:
1. Facilitate common programming tasks
2. Improve programming efficiency for the programmer
3. Improve time and memory efficiency through optimized internal implementations
6.1.1. Compound Data Types
The five most commonly used compound data structures built into Python are:
Lists
list
rainbow = ["Red", "Orange", "Yellow", "Green", "Blue", "Indigo", "Violet"]
Methods:
append, clear, copy, count, extend, index, insert, pop, remove, reverse, sort
Tuples
tuple
rainbow = ("Red", "Orange", "Yellow", "Green", "Blue", "Indigo", "Violet")
Methods:
count, index
Strings
str
rainbow = "RAINBOW"
Methods:
capitalize, casefold, center, count, encode, endswith, expandtabs, find, format, format_map, index, isalnum, isalpha, isascii, isdecimal, isdigit, isidentifier, islower, isnumeric, isprintable, isspace, istitle, isupper, join, ljust, lower, lstrip, maketrans, partition, replace, rfind, rindex, rjust, rpartition, rsplit, rstrip, split, splitlines, startswith, strip, swapcase, title, translate, upper, zfill
Sets
set
rainbow = {"Red", "Orange", "Yellow", "Green", "Blue", "Indigo", "Violet"}
Methods:
add, clear, copy, difference, difference_update, discard, intersection, intersection_update, isdisjoint, issubset, issuperset, pop, remove, symmetric_difference, symmetric_difference_update, union, update
Dictionaries
dict
rainbow = {"Red": 12, "Orange": 4.0, "Yellow":72.0, "Green":"Trees", "Blue":"Sad", "Indigo":"56", "Violet":None}
Methods:
clear, copy, fromkeys, get, items, keys, pop, popitem, setdefault, update, values
Each data structure corresponds to a particular organization of data. This organization imposes constraints and affordances on the retrieval and processing of data.
For the five built-in data structures in Python, the following table summarizes the key characteristics of each data structure:
| Type | Collection | Syntax | Ordered | Indexed | Mutable | Passed By (in effect) | Duplicates Allowed |
|---|---|---|---|---|---|---|---|
| `str` | characters | `"..."` | ✓ | ✓ | ✗ | value | ✓ |
| `list` | any data type | `[...]` | ✓ | ✓ | ✓ | reference | ✓ |
| `tuple` | any data type | `(...)` | ✓ | ✓ | ✗ | value | ✓ |
| `set` | immutable types | `{...}` | ✗ | ✗ | ✓ | reference | ✗ |
| `dict` | any data type * | `{key: value}` | ✓*** | ✓ | ✓ | reference | ✗** |

* keys can only be of an immutable type; values can be any data type
** keys cannot be duplicated; values can be duplicated
*** dictionaries preserve insertion order as of Python 3.7; they are indexed by key rather than by position
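Several of the characteristics summarized above can be checked directly. The following is a minimal sketch; the variable names and sample values are made up for illustration.

```python
letters = "RAINBOW"
colors_list = ["Red", "Orange", "Red"]
colors_tuple = ("Red", "Orange", "Red")
colors_set = {"Red", "Orange", "Red"}             # the duplicate "Red" is dropped
color_codes = {"Red": 1, "Orange": 2, "Red": 3}   # the later "Red" overwrites the earlier one

# Indexed: strings, lists and tuples by position, dictionaries by key.
print(letters[0], colors_list[0], colors_tuple[0], color_codes["Red"])  # R Red Red 3

# Duplicates: kept by lists and tuples, rejected by sets and dictionary keys.
print(len(colors_list), len(colors_tuple))  # 3 3
print(len(colors_set), len(color_codes))    # 2 2

rejected = []

# Immutable: a tuple does not support item assignment.
try:
    colors_tuple[0] = "Pink"
except TypeError:
    rejected.append("tuple item assignment")

# Set elements must be immutable (hashable): a list cannot be a set element.
try:
    bad_set = {["Red"]}
except TypeError:
    rejected.append("list as set element")

print(rejected)
```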
6.1.2. Create, Insert, Read, Update and Delete Operations
Many problems in computer science can be reduced to a very small set of fundamental problems:
- Create: Adding value(s) to a structure, inserting a node into a linked list, and inserting an element into a priority queue are all examples of the create problem. Other names of this problem include: Insert, Add, Post
- Read: Reading or accessing value(s) is one of the most fundamental problems in computer science. Other names of this problem include: Access, Get, Fetch, Retrieve
- Update: We are given a set of N records, each containing a key and some associated data, and a particular key K. The problem is to modify the record containing K in some way. Other names of this problem include: Modify, Edit, Patch
- Delete: Deleting a record from a database, deleting a node from a linked list, and deleting an element from a priority queue are all examples of the delete problem. Other names of this problem include: Remove, Drop
The table below summarizes the CRUD operations for each of the five built-in data structures in Python:
| Type | Create | Insert | Read | Update | Delete |
|---|---|---|---|---|---|
| `str` | | | | N/A | N/A |
| `list` | | | | | |
| `tuple` | | N/A | | N/A | N/A |
| `set` | | | N/A | N/A | |
| `dict` | | | | | |
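As a rough sketch of what these operations can look like in practice, the snippet below walks through them for each built-in structure. The variable names and values are invented for illustration, and the expressions are common idioms rather than necessarily the ones the table was built from.

```python
# list: supports every operation.
colors = ["Red", "Orange"]      # create
colors.append("Yellow")         # insert
first = colors[0]               # read by position
colors[0] = "Crimson"           # update in place
colors.remove("Orange")         # delete by value

# dict: supports every operation, addressed by key.
codes = {"Red": 1}              # create
codes["Blue"] = 2               # insert a new key
blue = codes["Blue"]            # read by key
codes["Blue"] = 20              # update an existing key
del codes["Red"]                # delete by key

# set: no positional read or in-place update, only membership tests.
tags = {"warm"}                 # create
tags.add("cool")                # insert
has_cool = "cool" in tags       # membership test instead of an indexed read
tags.discard("warm")            # delete

# str and tuple: immutable, so "inserting" builds a new object.
word = "RAIN"
longer_word = word + "BOW"      # new string; word is unchanged
pair = ("Red",)
longer_pair = pair + ("Orange",)  # new tuple; pair is unchanged

print(colors, codes, tags, longer_word, longer_pair)
```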
6.1.3. Mutability
Mutability is a concept in computer science that refers to whether or not a data structure can be modified after it has been created.
A data structure is said to be mutable if it can be modified after it has been created. Conversely, a data structure is said to be immutable if it cannot be modified after it has been created.
All elementary data types in Python are immutable. This means that once a value is created, it cannot be modified. This might seem counterintuitive since we can modify the value of a variable.
However, under the hood, when we modify the value of a variable, we are actually creating a new value and assigning it to the variable.
“Soon afterward” in the figure above refers to a process called garbage collection. When a new value is assigned to a variable, the old value is not modified; the variable is simply bound to the new value. Once nothing refers to the old value any longer, it is no longer needed and can be removed from memory.
Immutable objects never change their value. What looks like an update instead creates a new object with the updated value, as shown in the figure below.
The immutability of elementary data types has some important implications. For example, it means that we can safely pass these data types to functions without worrying that the function will modify the value of the variable that was passed to it. This is because a function has no way to change an immutable object in place; at most it can bind its own local name to a new object, which the caller never sees.
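The rebinding behaviour described above can be observed with the `is` operator, which tests whether two names refer to the same object. This is a minimal sketch with made-up variable names.

```python
count = 10
original = count          # a second name for the same int object
count += 1                # builds a new int and rebinds the name `count`

print(count, original)    # 11 10
print(count is original)  # False: the two names now refer to different objects

name = "rain"
alias = name
name += "bow"             # builds a new string; the old one is untouched
print(name, alias)        # rainbow rain
```

The old values survive here only because `original` and `alias` still refer to them.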
In contrast to elementary data types, some native data structures such as lists, sets, and dictionaries are mutable. This means that we can modify the contents of these data structures after they have been created.
This is the reason why we can modify the contents of a list, set, or dictionary without creating a new data structure and assigning it to the variable.
For example:
rainbow = ["Red", "Orange", "Yellow", "Green", "Blue", "Indigo", "Violet"]
rainbow.append("Pink")
Note that since lists are mutable, the `.append` method does not return a new list. Instead, it modifies the original list in place and returns `None`.
In contrast, the `.replace` method for strings does not modify the original string. Instead, it returns a new string with the specified replacement.
rainbow = "RAINBOW"
rainbow.replace("R", "P")
This is because strings are immutable.
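The contrast between the two methods becomes visible when their return values are captured. The names `rainbow_list`, `rainbow_word`, `result` and `replaced` below are illustrative.

```python
rainbow_list = ["Red", "Orange"]
result = rainbow_list.append("Pink")   # mutates the list in place
print(result)                          # None
print(rainbow_list)                    # ['Red', 'Orange', 'Pink']

rainbow_word = "RAINBOW"
replaced = rainbow_word.replace("R", "P")  # builds and returns a new string
print(replaced)                        # PAINBOW
print(rainbow_word)                    # RAINBOW (unchanged)
```

To keep the result of `.replace`, it has to be assigned to a name, as with `replaced` above.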
6.1.3.1. Pass by Value vs Pass by Reference
Strictly speaking, Python always passes a reference to the object itself; it never copies an argument on the way into a function. What differs is what the function can do with that reference, so mutable data types behave as if they were passed by reference, while immutable data types behave as if they were passed by value.
When an immutable data type is passed to a function, the function cannot modify the original object. It receives a reference to the same object, not a copy, but because the object cannot be changed in place, any apparent update only rebinds the function's local name to a new object.
When a mutable data type is passed to a function, the function can modify the original data structure. This is because the function is given a reference to the original data structure, not a copy of the data structure.
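The following sketch shows how argument passing plays out in practice. The function names (`add_pink`, `shout`, `replace_list`) are hypothetical helpers written for this example.

```python
def add_pink(colors):
    """Mutate the caller's list in place."""
    colors.append("Pink")


def shout(word):
    """Rebind the local name to a new string; the caller's string is untouched."""
    word = word.upper()
    return word


def replace_list(colors):
    """Rebind the local name to a new list; the caller's list is untouched."""
    colors = ["Black"]
    return colors


rainbow = ["Red", "Orange"]
add_pink(rainbow)
print(rainbow)             # ['Red', 'Orange', 'Pink']

greeting = "rainbow"
shouted = shout(greeting)
print(greeting, shouted)   # rainbow RAINBOW

replaced = replace_list(rainbow)
print(rainbow, replaced)   # ['Red', 'Orange', 'Pink'] ['Black']
```

Note that `replace_list` leaves the caller's list alone even though lists are mutable: assigning to a parameter only changes what the local name refers to, whereas calling a mutating method such as `.append` changes the shared object.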
6.1.3.2. Shallow vs Deep Copy
When we create a new list, set, or dictionary by copying an existing list, set, or dictionary, we can create a shallow copy or a deep copy.
A shallow copy is a new data structure that contains references to the same objects as the original data structure. This means that if we modify a mutable object contained in the original data structure (for example, a nested list), the change is also visible through the new data structure, because both hold the same object.
A deep copy is a new data structure that contains new objects that are copies of the objects in the original data structure. This means that if we modify an object in the original data structure, the same object in the new data structure will not be modified.
rainbow = ["Red", "Orange", "Yellow", "Green", "Blue", "Indigo", "Violet"]
rainbow_copy = rainbow
rainbow_copy.append("Pink")
print(rainbow)
In the code above, `rainbow_copy` is not a copy at all. The assignment `rainbow_copy = rainbow` binds a second name to the same list object, so both names refer to one list. As a result, when we modify `rainbow_copy`, we also modify `rainbow`, and the printed list includes "Pink".
rainbow = ["Red", "Orange", "Yellow", "Green", "Blue", "Indigo", "Violet"]
rainbow_copy = rainbow.copy()
rainbow_copy.append("Pink")
print(rainbow)
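In this second snippet, `rainbow.copy()` creates a new list, so appending to `rainbow_copy` leaves `rainbow` unchanged. Because the elements are immutable strings, a shallow copy is sufficient there. The difference between a shallow and a deep copy only shows up with nested mutable objects. The sketch below uses a made-up nested list named `palette` and the standard-library function `copy.deepcopy`.

```python
import copy

palette = [["Red", "Orange"], ["Blue", "Indigo"]]

alias = palette                  # same object under a second name
shallow = palette.copy()         # new outer list, shared inner lists
deep = copy.deepcopy(palette)    # new outer list and new inner lists

palette.append(["Green"])        # change to the outer list
palette[0].append("Yellow")      # change to a shared inner list

print(alias is palette)          # True
print(len(alias), len(shallow), len(deep))  # 3 2 2
print(shallow[0])                # ['Red', 'Orange', 'Yellow']
print(deep[0])                   # ['Red', 'Orange']
```

The shallow copy is insulated from changes to the outer list but still sees changes to the inner lists; the deep copy is insulated from both.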