Domino - Slice Discovery Method
Created: 14 Dec 2022, 03:38 PM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge
- How useful is Domino (https://hazyresearch.stanford.edu/blog/2022-04-02-domino)?
- Potentially useful for discovering failure modes, i.e. finding where the model predicts very poorly and what trends in the data cause it to fail
*Slice discovery:* the task of mining unstructured input data for semantically meaningful subgroups on which a model performs poorly.
From <https://hazyresearch.stanford.edu/blog/2022-04-02-domino>
- The challenge the Budapest team faces:
- The KPIs used to describe the models (AP, categorical accuracy) don't tell them, in the end, how to choose between different models
- A model can look good on the metrics, but once deployed it may miss certain objects in some frames; even with multiple near-identical frames in a video, detection can be inconsistent
- Difficult to deal with, because you can't just fine-tune on a few frames, and it's unclear how to even start understanding the problem
- Budapest asked whether we can propose some automotive-domain metrics to help with these kinds of issues
- Seems hard, since we don't have any specific domain knowledge
- It would be good to show where the failure modes are
- If there is extra time, we can look into this further
- This is a good track for helping them if we dig deeper to check its viability, i.e. that it isn't just paper-based
- Could be added to the MTL project, or into another full-blown project; that would also attract more stakeholder attention
Intro notebook: https://colab.research.google.com/github/HazyResearch/domino/blob/main/examples/01_intro.ipynb
Related tool: https://snorkel.ai/forager-rapid-data-exploration-model-development/
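For reference, the basic workflow from the intro notebook looks roughly like this. This is a minimal sketch from memory of the repo's README, so treat the exact argument and method names (`DominoSlicer`, `fit`, `predict_proba`, `embeddings`/`targets`/`pred_probs`) as assumptions to verify against the notebook; `dp` is the Meerkat DataPanel the notebook loads.

```python
# Minimal sketch of the Domino workflow; argument names assumed from the
# README, verify against the notebook before relying on them.
from domino import DominoSlicer

slicer = DominoSlicer(n_slices=5)  # fit an error-aware mixture model with 5 slices

# dp holds precomputed CLIP embeddings ("emb"), the binary task label
# ("target"), and the model's predicted probabilities ("prob").
slicer.fit(data=dp, embeddings="emb", targets="target", pred_probs="prob")

# Per-example slice memberships: an (n_examples, n_slices) matrix.
dp["domino_slices"] = slicer.predict_proba(
    data=dp, embeddings="emb", targets="target", pred_probs="prob"
)
```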


What is `preds`? Is it equal to the label name?
- See `preds_idx` and `preds_name`
What is `name`? Is it the ground-truth label?
- Yes, it is the ground-truth label
`target` = the binary label for the task, derived from ImageNet:
- 1 means that the image was labelled as a car
- 0 means that it was labelled as something other than a car
Notably, several images of cars seem to be mislabelled with a 0! The `prob` column contains the probability, according to the model, that the image is of a car.
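A quick way to surface those suspect rows (a hypothetical sketch, treating the notebook's columns as a plain pandas DataFrame `df`):

```python
import pandas as pd

# df has the notebook's columns: "name" (GT label), "target" (1 = car), "prob".
# Rows where the model is confident the image is a car but target == 0 are
# either genuine model errors or, as observed here, mislabelled data.
suspect = df[(df["target"] == 0) & (df["prob"] > 0.8)]
print(suspect[["name", "target", "prob"]].sort_values("prob", ascending=False))
```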

Natural language descriptions can be weird!
Where `target` is false, it means the class name does not have "car.n.01" as a hypernym
So "car wheel" is not a subset of "car"
But `prob` is the probability that the image is classified as a "car"
i.e. the first image was predicted to be a subclass of "car" (prob == 0.8297), but the image is labelled as "car wheel", which is not a subclass of "car"
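The hypernym relation can be checked directly in WordNet. A sketch using nltk; the synset id for the car-wheel class is my guess from the ImageNet label name:

```python
import nltk
nltk.download("wordnet")  # one-time download
from nltk.corpus import wordnet as wn

car = wn.synset("car.n.01")
wheel = wn.synset("car_wheel.n.01")  # assumed id for the "car wheel" class

# Walk the full hypernym closure of "car wheel". "car.n.01" is not in it:
# a car wheel is a *part* of a car (meronym), not a *kind* of car (hyponym),
# so such images get target == 0 even though they depict part of a car.
ancestors = set(wheel.closure(lambda s: s.hypernyms()))
print(car in ancestors)  # False
```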


e.g. this slice shows that police cars seem to have errors. Why?
- It seems like none of them are target, i.e. none of the GT labels are subsets of "car"
while the model predicted the images as some subset of "car",
so they show up as false positives!
But actually, it seems like the data is labelled poorly (a quick check is sketched below):
e.g. "police van" should also have "car.n.01" as a hypernym
And the "golfcart" image contains only a small golf cart next to a large ambulance / police vehicle
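To check whether a slice is a genuine failure mode or a labelling artifact, it helps to tabulate the GT labels inside the slice. A hypothetical sketch; `slice_probs` is the membership matrix from `predict_proba` above, and `names` an array of the GT label names:

```python
import numpy as np

slice_idx = 0                               # e.g. the slice full of police cars
members = slice_probs[:, slice_idx] > 0.5   # examples assigned to this slice

# Count the GT labels among the slice's members: if they are all vehicle
# classes outside "car.n.01" (police van, tow truck, ...), the "errors" are
# really an artifact of how target was defined.
labels, counts = np.unique(names[members], return_counts=True)
for label, count in sorted(zip(labels, counts), key=lambda t: -t[1]):
    print(f"{label}: {count}")
```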

In this case, again, all these items have target == false,
i.e. they are all false positives
But when we look closer, the data seems to have issues: these classes should have had "car" as a hypernym
The model is still predicting car types
When collating across slices:
These labels seem to have a different hypernym (not "car"):
- tow truck, moving van, minibus, recreational vehicle, golfcart, police van, garbage truck, passenger car
So we can try adding these synsets to the target synset set too, then see what happens to the discovered slices (sketched below)
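A sketch of that experiment with nltk. The synset ids below are my guesses mapped from the ImageNet label names (verify with `wn.synsets(...)`), and the `synset_id` column is an assumed stand-in for wherever the notebook stores each image's WordNet id:

```python
from nltk.corpus import wordnet as wn

# Original target plus the vehicle synsets that kept showing up as "false positives".
target_synsets = {
    wn.synset("car.n.01"),
    wn.synset("tow_truck.n.01"),
    wn.synset("moving_van.n.01"),
    wn.synset("minibus.n.01"),
    wn.synset("recreational_vehicle.n.01"),
    wn.synset("golfcart.n.01"),
    wn.synset("police_van.n.01"),
    wn.synset("garbage_truck.n.01"),
    wn.synset("passenger_car.n.01"),  # note: in WordNet this is a railway coach
}

def is_target(synset_id: str) -> bool:
    """True if the class is one of the target synsets or a descendant of one."""
    s = wn.synset(synset_id)
    ancestors = set(s.closure(lambda h: h.hypernyms())) | {s}
    return any(t in ancestors for t in target_synsets)

# Recompute the binary label, then re-fit Domino and compare the discovered slices.
dp["target_v2"] = [int(is_target(sid)) for sid in dp["synset_id"]]
```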

Is this the first real case of surfacing a model failure?
- i.e. Domino shows that the model predicts poorly in the car-interior scenario
- and Domino also correctly describes the slice as a car interior
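Conceptually, the description step matches each slice against candidate phrases in the shared CLIP embedding space. A hand-rolled sketch of that idea (not the library's actual API): rank phrases by cosine similarity to the slice's mean image embedding.

```python
import numpy as np

def describe_slice(img_emb, members, text_emb, phrases, top_k=5):
    """Rank candidate phrases by similarity to a slice's prototype embedding.

    img_emb: (n_examples, d) CLIP image embeddings, L2-normalised.
    members: boolean mask of the examples assigned to the slice.
    text_emb: (n_phrases, d) CLIP embeddings of candidate phrases, L2-normalised.
    """
    prototype = img_emb[members].mean(axis=0)
    prototype /= np.linalg.norm(prototype)
    sims = text_emb @ prototype              # cosine similarity per phrase
    best = np.argsort(-sims)[:top_k]
    return [(phrases[i], float(sims[i])) for i in best]

# For the slice found here, phrases like "a photo of a car interior" should
# rank near the top, matching the description Domino produced.
```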