Keras compile(loss_weights) vs fit(class_weight) vs Torch loss weights


Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a") Tags: knowledge


tf.keras.Model  |  TensorFlow v2.11.1

model.compile(loss_weights=xxx)

  • Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model’s outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.
  • The loss_weights parameter on compile defines how much each model output's loss contributes to the final loss value, i.e. it weighs the per-output losses. You could have a model with two outputs where one is the primary output and the other is auxiliary, e.g. total loss = 1.0 * primary + 0.3 * auxiliary (see the sketch after this list). The default weight for each output is 1.
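
A minimal sketch of the above, not taken from the docs: the two-output model, the layer names ("primary", "auxiliary"), and the input shape are all illustrative assumptions; only compile(loss, loss_weights) is the documented API.

```python
import tensorflow as tf

# Hypothetical two-output model: a primary classifier plus an auxiliary head.
inputs = tf.keras.Input(shape=(32,))
x = tf.keras.layers.Dense(64, activation="relu")(inputs)
primary = tf.keras.layers.Dense(10, activation="softmax", name="primary")(x)
auxiliary = tf.keras.layers.Dense(1, activation="sigmoid", name="auxiliary")(x)
model = tf.keras.Model(inputs=inputs, outputs=[primary, auxiliary])

model.compile(
    optimizer="adam",
    loss={
        "primary": "sparse_categorical_crossentropy",
        "auxiliary": "binary_crossentropy",
    },
    # Total loss minimized = 1.0 * primary_loss + 0.3 * auxiliary_loss
    loss_weights={"primary": 1.0, "auxiliary": 0.3},
)
```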

tf.keras.Model  |  TensorFlow v2.11.1

model.fit(class_weight=xxx)

  • Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to “pay more attention” to samples from an under-represented class.

  • The class_weight parameter on fit is used to weigh the importance of each sample based on the class it belongs to, during training. This is typically used when the distribution of samples per class is uneven.

  • This seems closest to the intended use case of weighting the loss function by class (see the sketch after this list).
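
A minimal sketch, assuming a binary problem where class 1 is under-represented: the model, data, and the 5.0 weight are made-up placeholders; the documented part is the fit(class_weight=...) dict mapping class index to weight.

```python
import numpy as np
import tensorflow as tf

# Toy binary classifier (illustrative only).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Imbalanced toy data: roughly 90% class 0, 10% class 1.
x = np.random.rand(1000, 8).astype("float32")
y = (np.random.rand(1000, 1) < 0.1).astype("float32")

# Dict maps class index -> weight; each class-1 sample contributes
# 5x as much to the training loss as a class-0 sample.
model.fit(x, y, epochs=1, class_weight={0: 1.0, 1: 5.0})
```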

Difference between loss_weights and class_weights · Issue #10507 · keras-team/keras · GitHub

tf.keras.losses.SparseCategoricalCrossentropy  |  TensorFlow v2.11.1

CrossEntropyLoss — PyTorch 2.1 documentation

  • If provided, the optional argument weight should be a 1D Tensor assigning weight to each of the classes. This is particularly useful when you have an unbalanced training set.

  • It just means the weight you give to different classes. Basically, for classes with a small number of training images, you give them more weight so that the network is punished more if it makes mistakes predicting the labels of these classes. For classes with large numbers of images, you give them a small weight.

What is the weight values mean in torch.nn.CrossEntropyLoss? - PyTorch Forums
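
A minimal sketch of the PyTorch side: the three-class setup and the 5.0 weight for the rare class are assumptions for illustration; the documented API is nn.CrossEntropyLoss(weight=...) taking one weight per class.

```python
import torch
import torch.nn as nn

# One weight per class, e.g. larger for the rare class (index 2 here)
# so that misclassifying it is penalized more heavily.
class_weights = torch.tensor([1.0, 1.0, 5.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(4, 3)            # batch of 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])  # ground-truth class indices
loss = criterion(logits, targets)
```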