Label Wise Significance Cross Entropy
  • Zhiyong Jin

Abstract

In machine learning, a model is essentially defined by a set of parameters organized in multiple layers. The parameters are learned by optimizing a loss function that measures the difference between the model output and the expected output (the labels). Normally, the digit for the expected class is 1 and all other digits are 0. In the cross-entropy loss function, every digit contributes to the loss. If the number of classes is high, the 0 digits far outnumber the single 1 digit, and the total loss from the 0 digits would be higher than the loss from the 1 digit. As a direct result, the loss does not conform to the objective of the optimization. This paper introduces Label Wise Significance Cross Entropy (LWSCE) as a loss function for model training. It allocates the weighted loss to the classes with significant differences between model output and expected output, and ignores minor differences on the 0 digits when they are not significant enough. The method was evaluated by training on the CIFAR10 and CIFAR100 datasets, comparing cross-entropy, cross-entropy with label smoothing, and LWSCE. The results show that LWSCE outperforms the other loss functions on the CIFAR100 dataset.
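
The imbalance described in the abstract can be illustrated with a minimal sketch. It assumes that each output digit is scored independently with an element-wise (binary) cross-entropy term, which is one reading of "all the digits are contributing to the loss". The function names and the `threshold` parameter are hypothetical and chosen for illustration; the thresholding rule is only a guess at what "ignoring the minor differences of 0 digits" could look like, not the LWSCE definition from the paper.

```python
import math

def per_digit_bce(outputs, target_index):
    """Element-wise binary cross-entropy of each output digit against a one-hot label.

    Assumption: each digit is an independent probability in (0, 1).
    """
    losses = []
    for i, p in enumerate(outputs):
        y = 1.0 if i == target_index else 0.0
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to keep log() finite
        losses.append(-(y * math.log(p) + (1.0 - y) * math.log(1.0 - p)))
    return losses

def split_loss(outputs, target_index):
    """Return (loss of the single 1 digit, summed loss of all 0 digits)."""
    losses = per_digit_bce(outputs, target_index)
    one_loss = losses[target_index]
    zero_loss = sum(losses) - one_loss
    return one_loss, zero_loss

def thresholded_loss(outputs, target_index, threshold=0.1):
    """Hypothetical illustration: keep the 1 digit's loss, and keep a 0 digit's
    loss only when its output differs from 0 by at least `threshold`."""
    losses = per_digit_bce(outputs, target_index)
    total = 0.0
    for i, (p, loss) in enumerate(zip(outputs, losses)):
        if i == target_index or p >= threshold:
            total += loss
    return total

# 100 classes: the expected class is fairly confident (0.9),
# and every other digit carries a small residual output (0.05).
num_classes = 100
target = 3
outputs = [0.05] * num_classes
outputs[target] = 0.9

one_loss, zero_loss = split_loss(outputs, target)
print("loss from the 1 digit: ", one_loss)
print("loss from the 0 digits:", zero_loss)
print("thresholded loss:      ", thresholded_loss(outputs, target))
```

Under these assumptions, the 99 small 0-digit terms together outweigh the single 1-digit term even though each is individually minor, and dropping the terms below the threshold leaves only the loss of the expected class.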