7.2. Differential Calculus#
7.2.1. Cost and Objective Functions#
Mean squared error is also known as the L2 loss function. It is defined as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.
Another common loss function is the L1 loss, which is defined as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

The L1 loss function is also known as the mean absolute error (MAE).
The L1 loss function is less sensitive to outliers than the L2 loss function. Because the MSE squares the difference between the true and predicted values, a single large error can dominate the total loss, while the MAE takes only the absolute value of each difference and therefore grows linearly with the size of an outlier.
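The effect is easy to see numerically. The vectors below are illustrative choices for this sketch (not values from the text); the last prediction is a deliberate outlier:

```python
import numpy as np

# Illustrative data: the last prediction misses badly (an outlier).
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.1, 1.9, 3.2, 4.0, 15.0])

errors = y_true - y_pred
mse = np.mean(errors ** 2)     # the squared outlier dominates the average
mae = np.mean(np.abs(errors))  # the outlier contributes only linearly

print(f"MSE = {mse:.2f}")  # ≈ 20.01
print(f"MAE = {mae:.2f}")  # ≈ 2.08
```

The single outlier inflates the MSE roughly tenfold relative to the MAE, which is why L1-based training is often preferred on data with heavy-tailed noise.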
7.2.2. Optimization#
Optimization is the process of finding the minimum (or maximum) of a function that depends on some inputs, called design variables. In machine learning, we usually want to find the minimum of a loss function, which measures how poorly our model fits the data. For example, in linear regression, we want to find the parameters that minimize the mean squared error (MSE) between the predictions of our model and the true values.
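As a minimal sketch of this idea, the snippet below fits a linear model $y = wx + b$ by gradient descent on the MSE. The synthetic data, learning rate, and step count are illustrative assumptions, not values from the text:

```python
import numpy as np

# Synthetic data: true parameters w=3, b=1, plus a little noise
# (illustrative choices for this sketch).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.01, size=100)

w, b = 0.0, 0.0  # design variables, initialized at zero
lr = 0.5         # learning rate (assumed)
for _ in range(2000):
    y_pred = w * x + b
    # Gradients of MSE = mean((y - y_pred)^2) with respect to w and b
    grad_w = -2 * np.mean((y - y_pred) * x)
    grad_b = -2 * np.mean(y - y_pred)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"w = {w:.2f}, b = {b:.2f}")  # should approach the true values 3 and 1
```

Each step moves the parameters a small distance opposite the gradient of the loss, so the MSE decreases until the estimates settle near the true values.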