Motivating the cross-entropy loss

Parsiad Azimzadeh · Sept. 24, 2023, 5:44 p.m.
Introduction

In machine learning, the cross-entropy loss is frequently introduced without explicitly emphasizing its underlying connection to the likelihood of a categorical distribution. Understanding this link can greatly enhance one’s grasp of the loss and is the topic of this short post.

Prerequisites

maximum likelihood estimator (MLE)

Categorical distribution likelihood

Consider an experiment in which we roll a (not necessarily fair) $K$-sided die. The result of this roll is an integer be...