Neural Network Calibration
Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge
Papers
- Dual Focal Loss for Calibration
- On Calibration of Modern Neural Networks
- Expected Calibration Error
- Temperature scaling
Introduction
- DNNs are highly successful in computer vision tasks.
- While accuracy is usually the focus, the reliability of confidence scores (uncertainty) is equally important.
- DNNs often overestimate their confidence scores, making them unreliable.
  - For instance, if a well-calibrated DNN assigns a confidence score of 0.8 to a set of predictions, about 80% of those predictions should be correct.
- Accurate calibration of uncertainty is crucial for real-world applications.
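The 0.8-confidence example above can be checked numerically. The following is a minimal sketch of a binned Expected Calibration Error (ECE), assuming equal-width confidence bins; the function name `expected_calibration_error` and the default of 10 bins are illustrative choices, not taken from these notes.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each confidence to a bin index using the interior edges only.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between how often the bin is right and how confident it claims to be.
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece

# Ten predictions at confidence 0.8, eight of them correct: calibrated.
calibrated = expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2)
# Ten predictions at confidence 0.99, only five correct: overconfident.
overconfident = expected_calibration_error([0.99] * 10, [1] * 5 + [0] * 5)
```

Here `calibrated` comes out at essentially zero, while `overconfident` is about 0.49, the gap between the claimed 99% and the observed 50% accuracy.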
Transclude of Neural-Network-Calibration-2023-10-26-15.40.09.excalidraw
- Confidence calibration is a model's ability to provide accurate probability estimates for its predictions.
  - In simpler terms, when a properly calibrated neural network assigns a confidence of 0.2 to an image being a cat, that prediction should have a 20% chance of being correct.
- By associating each prediction with a calibrated probability score, it becomes possible to identify and discard low-quality predictions.
- Consequently, even without a complete understanding of the neural network’s inner workings, confidence calibration offers a practical way to avert significant errors in real-world scenarios by providing an accurate uncertainty estimate for each prediction.
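Discarding low-quality predictions, as described above, could look like the following sketch. It assumes the model outputs one row of class probabilities per sample; the helper name `filter_by_confidence` and the threshold of 0.7 are hypothetical, and the filter is only as trustworthy as the calibration of the probabilities fed into it.

```python
import numpy as np

def filter_by_confidence(probs, threshold=0.5):
    """Return predicted classes, their confidences, and a keep/reject mask."""
    probs = np.asarray(probs, dtype=float)
    confidences = probs.max(axis=1)   # probability of the predicted class
    predictions = probs.argmax(axis=1)
    keep = confidences >= threshold   # False marks predictions to discard
    return predictions, confidences, keep

# Three samples, two classes; the second prediction is too uncertain to keep.
probs = [[0.90, 0.10],
         [0.55, 0.45],
         [0.20, 0.80]]
predictions, confidences, keep = filter_by_confidence(probs, threshold=0.7)
accepted = predictions[keep]
```

Rejected samples can then be routed to a fallback such as human review rather than acted on automatically.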
SoftMax Score and Model Confidence
- Practitioners often misinterpret the predictive probabilities produced by a neural network (i.e., the SoftMax scores) as model confidence. However, modern neural networks are widely known to make incorrect predictions with SoftMax scores close to 100%, which makes the raw predictive probability a misleading estimate of true confidence.
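To illustrate how SoftMax scores saturate, here is a small sketch of a SoftMax with an optional temperature, in the spirit of the temperature scaling entry listed under Papers. The logits and the temperature of 5.0 are made-up values for illustration; in temperature scaling the temperature is a single scalar fitted on held-out validation data, not chosen by hand.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """SoftMax over the last axis, with logits divided by a temperature first."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum(axis=-1, keepdims=True)

logits = [12.0, 2.0, 1.0]                  # hypothetical logits for one image
raw = softmax(logits)                      # top score is nearly 1.0
softened = softmax(logits, temperature=5.0)  # same ranking, lower top score
```

Dividing by a temperature above 1 lowers the top score without changing which class is predicted, so accuracy is unaffected while the reported confidence becomes less extreme.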
Questions
- What about underconfidence?