Images

An image is a 2D array of pixels. Each pixel is a vector of three values: red, green, and blue (RGB). These values are usually stored as integers between 0 and 255, although some libraries scale them to floats between 0 and 1 instead. Together they determine the color of the pixel. For example, a pixel with RGB values of (255, 0, 0) is red, (0, 255, 0) is green, and (0, 0, 255) is blue. A pixel with RGB values of (0, 0, 0) is black and (255, 255, 255) is white.
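As a minimal sketch of this representation, the snippet below builds a tiny image by hand as a NumPy array. The array name `tiny` and its 2x3 size are illustrative assumptions, not part of the notebook's data.

```python
import numpy as np

# A 2x3 image: 2 rows (height), 3 columns (width), 3 channels (R, G, B).
# The name `tiny` and the size are arbitrary choices for illustration.
tiny = np.zeros((2, 3, 3), dtype=np.uint8)

tiny[0, 0] = (255, 0, 0)      # red
tiny[0, 1] = (0, 255, 0)      # green
tiny[0, 2] = (0, 0, 255)      # blue
tiny[1, 0] = (0, 0, 0)        # black
tiny[1, 1] = (255, 255, 255)  # white
tiny[1, 2] = (255, 255, 0)    # red + green = yellow

print(tiny.shape)             # (2, 3, 3)
print(tiny[0, 0].tolist())    # [255, 0, 0]
```

Passing `tiny` to `plt.imshow` should display the six pixels as colored squares.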

In this notebook, we will learn how to read and write images, and how to manipulate them.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd 

plt.style.use('dark_background')

# Download an image, then read it into a NumPy array
!wget https://raw.githubusercontent.com/fahadsultan/csc272/main/data/belltower.png
img = plt.imread('belltower.png')
plt.imshow(img);
--2024-10-17 08:17:16--  https://raw.githubusercontent.com/fahadsultan/csc272/main/data/belltower.png
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3604223 (3.4M) [image/png]
Saving to: 'belltower.png'

belltower.png       100%[===================>]   3.44M  --.-KB/s    in 0.1s    

2024-10-17 08:17:17 (29.1 MB/s) - 'belltower.png' saved [3604223/3604223]

img.shape # (height, width, channels)
(1600, 1267, 4)
img[0,0,:] # RGBA values of the first pixel: floats in [0, 1]; the 4th value is alpha (opacity)
array([0.9411765 , 0.94509804, 0.9529412 , 1.        ], dtype=float32)
# Convert to grayscale by averaging the R, G, B channels (the alpha channel is excluded)
img_gray = img[:, :, :3].mean(axis=2)
img_df   = pd.DataFrame(img_gray)
img_df.shape
(1600, 1267)
# Display the grayscale image with a colorbar

plt.imshow(img_df, cmap='gray');
plt.colorbar();

# Invert the image (pixel values are floats in [0, 1], so subtract from 1)
plt.imshow(1 - img_df, cmap='gray');
plt.colorbar();

# Threshold the image
img_thresh = img_gray > 0.5 
plt.imshow(img_thresh, cmap='gray');

# Crop the image to its top-left 700x700 region
img_crop = img_df.iloc[:700, :700]
plt.imshow(img_crop, cmap='gray');

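# Rotate the image 90 degrees counterclockwise
# (a plain transpose would mirror the image across its diagonal, not rotate it)
img_rot = np.rot90(img_df.to_numpy())
plt.imshow(img_rot, cmap='gray');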
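Writing an image back to disk can be sketched with `plt.imsave`. The example below uses a small synthetic gradient rather than the bell tower image, and the output file name `demo_gray.png` in a temporary directory is a placeholder assumption.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt

# A small synthetic grayscale image: a horizontal gradient from 0 (black) to 1 (white).
demo = np.tile(np.linspace(0.0, 1.0, 8), (4, 1))  # shape (4, 8)

# Write it to a PNG file. The path is a hypothetical placeholder.
out_path = os.path.join(tempfile.mkdtemp(), "demo_gray.png")
plt.imsave(out_path, demo, cmap="gray", vmin=0.0, vmax=1.0)

# Read the file back to confirm it holds an image of the same height and width.
loaded = plt.imread(out_path)
print(loaded.shape[:2])  # (4, 8)
```

The same call should work for the arrays above, for example `plt.imsave('belltower_gray.png', img_gray, cmap='gray')`, where the file name is again only an example.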

Videos

A video is a sequence of images. In the context of video, each image is called a frame. Most videos play at 24 to 30 frames per second.
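One way to picture this is as a 4D array with one extra axis for time. The sketch below is illustrative only: the frame rate, frame count, and frame size are assumed values, and the frames are synthetic rather than decoded from a real video file.

```python
import numpy as np

fps = 24              # assumed frame rate (frames per second)
n_frames = 48         # assumed number of frames in the clip
height, width = 4, 6  # assumed (tiny) frame size

# A video as a 4D array: (frames, height, width, channels).
video = np.zeros((n_frames, height, width, 3), dtype=np.uint8)

# Make each frame slightly brighter than the previous one (a fade from black to white).
for i in range(n_frames):
    video[i] = int(255 * i / (n_frames - 1))

first_frame = video[0]       # a single frame is an ordinary image
duration_s = n_frames / fps  # playback time in seconds

print(first_frame.shape)     # (4, 6, 3)
print(duration_s)            # 2.0
```

Indexing the first axis selects a frame, so any of the image operations above can be applied frame by frame.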

Most modern videos are encoded using a codec. A codec is a method of encoding and decoding video, typically to compress it. Some common codecs are H.264, MPEG-4, and VP9.