Data#

Important

This page is under construction. All or part of the content may be incomplete or incorrect.

Data is to computer science what chemicals are to chemistry or organisms are to biology.

Data is the raw material that we process, analyze, and transform into insight and knowledge.

In this section, we will learn how to represent elementary data in Python. Later, we will learn how to represent more complex data structures.


Values and Types#

  • The atomic indivisible unit of data in computer programming is called a value.

  • Values are the most basic things that a computer program manipulates or calculates.

For example, the number 42 is a value. So is "Hello World!".

  • Each value belongs to a type.

  • The type of a value determines its interpretation by the computer and the operations that can be performed on it.

For example, the value 42 is of type int (short for integer) and the value "Hello World!" is of type str (short for string, so called because it contains a string of characters).
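As a quick check, the built-in type() function reports the type of any value. The following is a minimal sketch; the variable names are hypothetical and chosen only for illustration.

```python
# Ask Python for the type of a few literal values.
answer_type = type(42)                # int
greeting_type = type("Hello World!")  # str
ratio_type = type(3.14159)            # float

print(answer_type.__name__)    # int
print(greeting_type.__name__)  # str
print(ratio_type.__name__)     # float

# The type determines which operations are available and what they mean.
print(42 + 1)               # integer addition: 43
print("Hello" + " World!")  # string concatenation: Hello World!
```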

Python comes with the following built-in data types:

| Python Data Type | Description | Category | Mutable | Example Values |
|---|---|---|---|---|
| int | Integers | Numeric | No | 42, 0, -1, 10000000000 |
| float | Floating point numbers | Numeric | No | 3.14159, 0.0, -1.0, 1.0e10 |
| complex | Complex numbers | Numeric | No | 3 + 4j, 1j |
| bool | Boolean values | Boolean | No | True, False |
| str | String values | Text | No | "Hello World!", "42" |
| list | Ordered mutable sequences of values | Sequence | Yes | [1, 2, 3], ["Hello", "World"] |
| tuple | Ordered immutable sequences of values | Sequence | No | (1, 2, 3), ("Hello", "World") |
| range | Immutable sequence of numbers | Sequence | No | range(10), range(1, 10, 2) |
| dict | Mapping of keys to values (keeps insertion order since Python 3.7) | Mapping | Yes | {"a": 1, "b": 2} |
| set | Unordered collection of unique values | Set | Yes | {1, 2, 3} |
| frozenset | Immutable set | Set | No | frozenset({1, 2, 3}) |
| bytes | Immutable sequence of bytes | Binary | No | b"Hello World!" |
| bytearray | Mutable sequence of bytes | Binary | Yes | bytearray(b"Hello World!") |
| memoryview | Memory view of bytes | Binary | Depends on the underlying object | memoryview(b"Hello World!") |
| NoneType | Special type indicating no value | NoneType | No | None |
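To make the difference between mutable and immutable types concrete, here is a small sketch comparing a list with a tuple. The variable names are hypothetical.

```python
# A list is mutable: it can be changed in place.
numbers = [1, 2, 3]
numbers.append(4)
print(numbers)  # [1, 2, 3, 4]

# A tuple is immutable: attempting to change an element raises a TypeError.
point = (1, 2, 3)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed)  # False
print(point)          # (1, 2, 3)
```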

Variables#

One of the most powerful features of a programming language is the ability to manipulate variables.

Similar to algebra, variables in computer programming are names that refer to values.

In algebra, the following statement declares that the variable x has the value 42:

\[x = 42\]

In Python, the following statement assigns the value 42 to the variable x:

x = 42

In Python, a variable is just a name. Values live somewhere else, and a variable refers to a value. Multiple names can refer to the same value. Python calls whatever is needed to refer to a value a reference. Assigning to a variable (or object field, or …) simply makes it refer to another value. The model of named storage locations does not apply to Python: the programmer never handles storage locations for values. All the programmer stores and shuffles around are references, and those are not themselves values in Python, so they cannot be the target of other references.
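The idea that several names can refer to the same value can be seen with a mutable value such as a list. This is an illustrative sketch with hypothetical names; the is operator checks whether two names refer to the same object.

```python
# Two names bound to the same list object.
first = [1, 2, 3]
second = first
print(first is second)  # True: both names refer to one object

# A change made through one name is visible through the other.
second.append(4)
print(first)  # [1, 2, 3, 4]

# Assignment rebinds a name; it does not alter the object it used to refer to.
second = [10, 20]
print(first)   # [1, 2, 3, 4]
print(second)  # [10, 20]
```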

Variables are the first means of abstraction in computer programming. They allow us to abstract away the details of the value and refer to computations in more general terms.

Rules for Naming Variables:

In Python, variable names can be arbitrarily long. They can contain both letters and digits, but they must begin with a letter or an underscore.

  • Variable names should be descriptive. This is a convention for readability, not a rule that Python enforces.

  • A variable name must start with a letter or the underscore character; it cannot start with a digit

  • A variable name can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _ )

  • Variable names are case-sensitive (age, Age and AGE are three different variables)

  • By convention, ordinary variable names are written in lower case. In multi-word names, separate the words with _ (for example, ten_percent).
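The syntactic rules above can be checked programmatically. The sketch below uses str.isidentifier() and keyword.iskeyword() from the standard library; the helper is_valid_variable_name and the candidate names are hypothetical. Note that isidentifier() alone returns True for reserved words, and it also accepts non-ASCII letters, so it is slightly more permissive than the simplified rules listed here.

```python
import keyword


def is_valid_variable_name(name):
    """Return True if `name` is a legal identifier and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Hypothetical candidate names, chosen to exercise the rules above.
candidates = ["total_price", "_hidden", "item2", "2item", "item-two", "class"]
results = {name: is_valid_variable_name(name) for name in candidates}

for name, ok in results.items():
    print(f"{name!r}: {ok}")
```

The first three names are accepted. "2item" starts with a digit, "item-two" contains a character other than a letter, digit, or underscore, and "class" is a reserved word.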

Dynamic and Strong Typing#

You (and Python) can identify strings because they are enclosed in quotation marks, either double (") or single (').

Python is a dynamically typed language, which means two things:

  1. The type of a value is inferred at runtime (as opposed to compile time) based on the value

  2. The same variable can be assigned values of different types at different times.
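Both points can be observed directly. In this minimal sketch the name x is rebound to values of three different types, and type() is asked each time; the helper names first_type, second_type, and third_type are hypothetical.

```python
x = 42
first_type = type(x)   # int

x = "forty-two"
second_type = type(x)  # str

x = 42.0
third_type = type(x)   # float

print(first_type.__name__, second_type.__name__, third_type.__name__)
# int str float
```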

[Figure: https://res.cloudinary.com/practicaldev/image/fetch/s--i1yqfSl1--/c_imagga_scale,f_auto,fl_progressive,h_900,q_auto,w_1600/https://miro.medium.com/max/1400/1%2ABddwVWW6hFU0miT9DCbUWQ.png]

Different programming languages have different approaches to types. Python is a dynamically typed and strongly typed programming language.

Python is also a strongly typed language, which means that every value has a type and that type matters when performing operations on it.

The = symbol in Python is called the assignment operator. It allows us to create new variables (names) and make them refer to values.

Note that the assignment operator assigns the value on the right to the variable on the left. In other words, the directionality of assignment in x = 42 is \(x \leftarrow 42\).

Variables can naturally be assigned different values at different times.

In fact, the same variable can be assigned values of different types at different times.

message = "What's up, Doc?"
n = 17
pi = 3.14159

This example makes three assignments. The first assigns the string "What's up, Doc?" to a new variable named message. The second gives the integer 17 to n, and the third gives the floating-point number 3.14159 to pi.

The assignment operator, =, should not be confused with an equals sign (even though it uses the same character). Assignment operators link a name, on the left hand side of the operator, with a value, on the right hand side. This is why you will get an error if you enter:


17 = n # Error: can't assign to literal
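The difference between assignment (=) and equality comparison (==) can be sketched as follows. Here compile() is used only so that the SyntaxError can be observed without stopping the script; the name literal_assignment_ok is hypothetical.

```python
n = 17           # assignment: bind the name n to the value 17
print(n == 17)   # comparison: True
print(17 == n)   # comparison works in either direction: True

# Assigning to a literal is rejected before the code ever runs.
try:
    compile("17 = n", "<example>", "exec")
    literal_assignment_ok = True
except SyntaxError:
    literal_assignment_ok = False
print(literal_assignment_ok)  # False
```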

A common way to represent variables on paper is to write the name with an arrow pointing to the variable’s value. This kind of figure is called a state diagram because it shows what state each of the variables is in (think of it as the variable’s state of mind). This diagram shows the result of the assignment statements:

message → "What's up, Doc?"
n → 17
pi → 3.14159

The print function also works with variables.

print(message)
print(n)
print(pi)
What's up, Doc?
17
3.14159

In each case the result is the value of the variable. Variables also have types; again, we can ask the interpreter what they are.

type(message), type(n), type(pi)
(str, int, float)

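The type of a variable is the type of the value it refers to.

Because variable names are case-sensitive, names that differ only in capitalization refer to separate variables: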

ten_percent = 3
ten_percent
3
Ten_percent = 10
Ten_percent
10
TEN_PERCENT = 10000
TEN_PERCENT
10000
ten_percent = 5
ten_percent
5
age = 2
age
2
Age = 20
Age
20
AGE = 200
AGE
200

Which of the following variable names would raise a SyntaxError?

10_percent = 27.03 / 10

ten_percent = 27.03 / 10
ten_percent

_10_percent = 27.03 / 10
_10_percent

20_percent = 27.03 / 10

twenty_percent = 27.03 / 10

twenty_% = 27.03 / 10
  Input In [11]
    10_percent = 27.03 / 10
      ^
SyntaxError: invalid decimal literal

Reserved words

Python reserves a set of words, called keywords, that have special meaning to the language. You cannot use reserved words as variable names. Python 3.7 and later have 35 of them (earlier Python 3 releases had 33; async and await were added in 3.7):

False, None, True, and, as, assert, async, await, break, class, continue, def, del, elif, else, except, finally, for, from, global, if, import, in, is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield
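Because the exact set of reserved words depends on the Python version, it is safer to ask the interpreter than to memorize a list. The standard library's keyword module provides this; the sketch below is illustrative and the name keyword_assignment_ok is hypothetical.

```python
import keyword

print(len(keyword.kwlist))         # number of reserved words in this Python version
print(keyword.iskeyword("while"))  # True
print(keyword.iskeyword("While"))  # False: reserved words are case-sensitive

# Using a reserved word as a variable name is a syntax error.
try:
    compile("class = 1", "<example>", "exec")
    keyword_assignment_ok = True
except SyntaxError:
    keyword_assignment_ok = False
print(keyword_assignment_ok)  # False
```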

  • A programming language is more than just a means for instructing a computer to perform tasks.

  • It also serves as a framework within which we organize our ideas about computational processes.

  • Programs serve to communicate those ideas among the members of a programming community.

  • Thus, let me re-emphasize: programs must be written for people to read, and only incidentally for machines to execute.


  • You can get the data type of any variable using the built-in type() function

  • The Python programming language is both:

  1. Strongly typed:

    • All variables have a type

    • The type matters when performing operations on a variable.

  2. Dynamically typed:

    • In Python, the data type is set when you assign a value to a variable (during runtime).
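Strong typing can be seen when an operation mixes types that do not belong together. In this minimal sketch (variable names are hypothetical), adding a str to an int raises a TypeError instead of being converted silently, and an explicit conversion states which interpretation is intended.

```python
count = 1
label = "1"

# Mixing str and int with + is not converted silently; it raises a TypeError.
try:
    result = label + count
except TypeError as error:
    result = None
    print(f"TypeError: {error}")

# An explicit conversion makes the intended interpretation clear.
as_number = int(label) + count  # 2
as_text = label + str(count)    # "11"
print(as_number, as_text)       # 2 11
```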

Glossary#

value#

The most basic thing a program works with. Values include numbers (like 42 or 3.14159) and strings (like "Hello World!").

type#

A category of values. The types we have seen so far are integers (type int), floating-point numbers (type float), and strings (type str).

variable#

A name that refers to a value.

Strongly typed#

All variables have a type and the type matters when performing operations on a variable.

Dynamically typed#

The data type is set when you assign a value to a variable (during runtime).

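Weakly typed#

The language implicitly converts values between types when performing operations, so the type of a value matters less. This is the opposite of strongly typed.

Statically typed#

The data type of a variable is fixed before the program runs (at compile time). This is the opposite of dynamically typed.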


Exercises#

  1. Which of the following are valid variable names in Python? Why or why not?

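|  | Variable Name | Valid? | Why or Why Not? |
|---|---|---|---|
| i. | age |  |  |
| ii. | Age |  |  |
| iii. | AGE |  |  |
| iv. | age_ |  |  |
| v. | age! |  |  |
| vi. | age@ |  |  |
| vii. | age$ |  |  |
| viii. | False |  |  |
| ix. | false |  |  |
| x. | while |  |  |
| xi. | _while |  |  |
| xii. | while_ |  |  |
| xiii. | def |  |  |
| xiv. | if |  |  |
| xv. | elif |  |  |
| xvi. | else |  |  |
| xvii. | return |  |  |
| xviii. | None |  |  |
| xix. | none |  |  |
| xx. | NoneType |  |  |
| xxi. | none_type |  |  |
| xxii. | none-type |  |  |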