Domino - Slice Discovery Method


Created: 14 Dec 2022, 03:38 PM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a") Tags: knowledge,


*slice discovery:* the task of mining unstructured input data for semantically meaningful subgroups on which a model performs poorly.

From <https://hazyresearch.stanford.edu/blog/2022-04-02-domino>
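Domino's core move is to fit an error-aware mixture model jointly over embeddings, ground-truth labels, and model predictions, so the resulting clusters (slices) separate regions where the model errs. A rough numpy-only sketch of that joint-clustering idea, with a plain k-means standing in for Domino's Gaussian mixture — all data and names below are synthetic, this is not the domino package's API:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 8, 5

# Fake cross-modal embeddings plus binary labels and predicted probs.
embeddings = rng.normal(size=(n, d))
labels = rng.integers(0, 2, size=n).astype(float)
probs = rng.random(size=n)

# Domino's key idea: cluster on embedding ++ label ++ prediction jointly,
# so clusters can isolate regions where predictions disagree with labels.
feats = np.hstack([embeddings, labels[:, None], probs[:, None]])

# Tiny k-means over the joint features.
centers = feats[rng.choice(n, k, replace=False)]
for _ in range(20):
    assign = np.argmin(((feats[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    for j in range(k):
        if (assign == j).any():
            centers[j] = feats[assign == j].mean(0)

# Rank slices by error rate to surface underperforming subgroups.
errors = (probs.round() != labels).astype(float)
rates = [errors[assign == j].mean() for j in range(k) if (assign == j).any()]
print(sorted(rates, reverse=True))
```

The real method additionally weights the label/prediction terms and uses a cross-modal (image-text) embedding so slices can be described in natural language.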

  • Budapest challenge that they face
    • The KPIs used to describe the models (AP, categorical accuracy) don’t tell them, in the end, how to choose between different models
    • A model can look good in terms of the metric, but on deployment it might lose frames for certain objects; even with multiple near-identical frames in a video it can have problems detecting them
    • Difficult to deal with because you can’t fine-tune on just a few frames, and it’s unclear how to even start understanding the problem
  • Budapest requested that we propose some automotive-domain metrics to help with these kinds of issues
    • Seems hard, since we don’t have any specific domain knowledge
  • Good to show that such failure modes exist
  • If there is extra time, can look into this further
  • This is a good track to help them, if we dig deeper to check its viability and confirm it’s not just paper-based
  • Could be added to the MTL project, or spun into another full-blown project; would attract more stakeholder attention too

https://colab.research.google.com/github/HazyResearch/domino/blob/main/examples/01_intro.ipynb

https://faitpoms.com/

https://snorkel.ai/forager-rapid-data-exploration-model-development/

Forager Demo

What is preds? Is it the label name?

  • See preds_idx and preds_name

What is name? Is it the ground truth label?

  • Yes, it is the ground truth label

Target = ImageNet label for the task

  • 1 means that the image was labeled as a car
  • 0 means that it was labeled as something other than a car.

Notably, several images of cars seem to be mislabeled with a 0! The prob column contains the probability, according to the model, that the image is of a car.
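The target/prob combination already suggests a quick sanity check: rows where the model is confident (high prob) but target == 0 are either model false positives or, as observed here, candidate label errors. A hypothetical sketch with made-up rows (column names follow the notebook):

```python
# Surface likely mislabeled images: confident "car" predictions
# (high prob) whose target says not-car. Rows are fake examples.
rows = [
    {"name": "police_van.n.01", "target": 0, "prob": 0.91},
    {"name": "car_wheel.n.01",  "target": 0, "prob": 0.83},
    {"name": "minivan.n.01",    "target": 1, "prob": 0.95},
]

suspects = [r["name"] for r in rows if r["target"] == 0 and r["prob"] > 0.8]
print(suspects)  # ['police_van.n.01', 'car_wheel.n.01']
```

Sorting these by prob descending gives a review queue for relabelling.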

Natural language descriptions can be weird!

Where target is false, it means that the class name does not have “car.n.01” as its hypernym

So “car wheel” is not a subset of “car”

But prob is the probability of the image being classified as a “car”

i.e. the first image was predicted to be a subclass of “car” since prob == 0.8297, but the image is actually labelled as car wheel (i.e. not a subclass of car)
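The target rule is effectively a hypernym-chain lookup: target is true iff “car.n.01” appears somewhere in the label synset’s hypernym chain. A minimal sketch with a tiny hand-rolled taxonomy standing in for WordNet — the synset names and parent links here are illustrative, not the real WordNet graph:

```python
# Toy child -> parent synset map (illustrative, not real WordNet).
hypernyms = {
    "minivan.n.01": "car.n.01",
    "police_van.n.01": "van.n.05",
    "car_wheel.n.01": "wheel.n.01",
    "car.n.01": "motor_vehicle.n.01",
}

def has_car_hypernym(synset: str) -> bool:
    # Walk up the chain until we hit car.n.01 or run out of parents.
    while synset in hypernyms:
        synset = hypernyms[synset]
        if synset == "car.n.01":
            return True
    return False

print(has_car_hypernym("minivan.n.01"))    # True  -> target == 1
print(has_car_hypernym("car_wheel.n.01"))  # False -> target == 0
```

Under this rule, a class like police van with a non-car parent ends up target == 0 even when the image visually contains a car, which is exactly the labelling quirk the notes flag.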

e.g. this shows that police cars seem to produce errors. Why?

  • Seems like none of them are target, i.e. none of the GT labels are subsets of “car”

While the model predicted the images as some subset of car!

So they show up as False Positives!

But actually, it seems like the data is labelled poorly

e.g. police van should also have “car.n.01” as a hypernym

And the golfcart image has a small golf cart but a large ambulance / police vehicle

In this case, again, all of these items have target = false

i.e. they are all False Positives

But when looking closer, the data seems to have an issue, as these should have been labelled with car as their hypernym

The model is still predicting car types

When collating across slices:

These labels seem to have a different hypernym (not car):

  • tow truck, moving van, minibus, recreational vehicle, golfcart, police van, garbage truck, passenger car

So we can try adding these synsets to the target synset set too, then see what happens to the discovered slices
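A hedged sketch of that experiment: treat the listed classes as additional positives when computing target, then re-run the slicer on the relabelled data. The names and rows below are made up for illustration:

```python
# Classes from the discovered slices to also count as "car".
extra = {
    "tow_truck", "moving_van", "minibus", "recreational_vehicle",
    "golfcart", "police_van", "garbage_truck", "passenger_car",
}

# Fake examples with their original car.n.01-derived target labels.
examples = [
    {"name": "police_van", "target": 0},
    {"name": "minivan", "target": 1},
    {"name": "car_wheel", "target": 0},
]

for ex in examples:
    # Keep the original label, but also accept the extra synsets
    # as positives when relabelling target.
    ex["target"] = int(ex["target"] == 1 or ex["name"] in extra)

print(examples)  # police_van flips to 1; car_wheel stays 0
```

If the relabelling is right, the formerly “false positive” slices should stop appearing, and any remaining slices are more likely to be genuine model failures.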

Is this the first real case showing a model failure?

  • i.e. Domino shows that the model predicts poorly in scenarios with car interiors
  • and Domino also correctly describes the slice as a car interior