Machine Learning | Loss Function
Definition
In a machine learning context, the loss (often also called the error) is a measurement of how far a model's predictions are from the actual outcomes. In other words, it's a numerical score that represents how bad a prediction was: the higher the number, the worse the prediction.
A loss function is a function (in the mathematical sense) that calculates the loss for a set of predictions, given the true outcomes. Loss functions aren't given automatically—they must be chosen based on the context of the data—and common options include residual sum of squares [5] and cross-entropy [7].
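As an illustrative sketch (the function names and the use of NumPy are my own choices, not from the sources above), here is how those two loss functions might be computed. Note that both return higher scores for worse predictions:

```python
import numpy as np

def residual_sum_of_squares(y_true, y_pred):
    """Sum of squared differences between true outcomes and predictions [5]."""
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sum(residuals ** 2))

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy [7]: heavily penalizes confident wrong predictions.

    y_true holds 0/1 labels; p_pred holds predicted probabilities of class 1.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

print(residual_sum_of_squares([3.0, 5.0], [2.5, 5.5]))  # 0.5
print(cross_entropy([1, 0], [0.9, 0.2]))                # ~0.164
```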
Loss Minimization
Since the loss represents how far off a prediction was from the true outcome, it follows that predictions with lower losses are "better", in a numerical sense, than those with higher losses. Thus, when training a prediction model, a primary goal is to adjust the model's parameters to minimize the loss on the training dataset, in the hope that this will also maximize the quality of the predictions when the model is applied to real-world data.
The techniques used to minimize the loss function vary, though (stochastic) gradient descent [6] is very commonly used. A sketch of the idea follows below.
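As a minimal sketch of plain (batch) gradient descent—all variable names, data, and hyperparameters here are illustrative, not from the sources—consider fitting a one-parameter linear model by repeatedly stepping the parameter in the direction that decreases a squared-error loss:

```python
import numpy as np

# Synthetic training data: y is roughly 3.0 * x plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)

w = 0.0                 # initial parameter guess
learning_rate = 0.1
for step in range(200):
    y_pred = w * x
    # Loss is mean squared error; its gradient w.r.t. w is 2 * mean((y_pred - y) * x).
    grad = 2.0 * np.mean((y_pred - y) * x)
    w -= learning_rate * grad  # step "downhill" on the loss surface

print(w)  # converges near the true slope of 3.0
```

Stochastic gradient descent follows the same update rule but estimates the gradient from a small random subset of the training data at each step, which is cheaper per iteration on large datasets.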
Footnotes/Resources:
[1] Intro to loss functions in machine learning: https://towardsdatascience.com/optimization-of-supervised-learning-loss-function-under-the-hood-df1791391c82
[2] Part 2 of [1]: https://towardsdatascience.com/optimization-loss-function-under-the-hood-part-ii-d20a239cde11
[3] Choosing loss functions for neural networks: https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/
[7] Cross-entropy for machine learning: https://machinelearningmastery.com/cross-entropy-for-machine-learning/