Domino + Fine Grained CLIP
Created: 02 Feb 2023, 05:29 PM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge,
Test whether CLIP is fine-grained enough to pick out differences within a single ImageNet class ⇒ i.e. find subclasses within that one class
Ethical issues - https://oatml.cs.ox.ac.uk/blog/2021/06/27/web-scraped-harmful.html
- going back to classification with ImageNet, but restricted to a single class / a single group of classes
⇒ tried a group of related classes around “car”, but the resulting descriptions don't seem very informative
⇒ tried a single class, to see whether it can find clusters within that class and describe each cluster in enough detail
- e.g. a photo of a car in winter conditions?
⇒ intuition from current testing is that it can’t describe clusters at that level of detail
⇒ need to update the phrase templates / mask somehow, or use a version of CLIP pretrained on automotive image-text pairs
⇒ but the mask is already “a photo of a [xxxx] {} [xxxx]”
→ results not great, it doesn't seem to produce many distinct descriptions for the single class
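
Rough sketch of the describe-a-cluster step, to make the notes above concrete. This is a toy version under my own assumptions, not Domino's or CLIP's actual API: the slot names `adj` and `ctx`, the helpers `build_phrases` and `describe_cluster`, and the idea of scoring phrases against (cluster mean minus class mean) are all illustrative. Embeddings are plain lists of floats here; in practice they would come from CLIP's image and text encoders.

```python
import math

# Hypothetical rendering of the mask "a photo of a [xxxx] {} [xxxx]":
# the two slot names are assumptions made for this sketch only.
TEMPLATE = "a photo of a {adj} {cls} {ctx}"


def build_phrases(cls, adjectives, contexts):
    """Fill the template with every adjective/context pair (empty slots allowed)."""
    phrases = []
    for adj in adjectives:
        for ctx in contexts:
            raw = TEMPLATE.format(adj=adj, cls=cls, ctx=ctx)
            # Collapse the extra spaces left behind by empty slots.
            phrases.append(" ".join(raw.split()))
    return phrases


def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def describe_cluster(cluster_embs, class_embs, phrase_embs, top_k=3):
    """Rank phrases by similarity to what makes the cluster differ from its class.

    cluster_embs: image embeddings of one cluster
    class_embs:   image embeddings of the whole class (cluster included)
    phrase_embs:  dict mapping phrase -> text embedding
    """
    cluster_mean = mean(cluster_embs)
    class_mean = mean(class_embs)
    # Subtracting the class mean removes the shared "it's a car" component,
    # so generic phrases should score lower than cluster-specific ones.
    direction = [c - m for c, m in zip(cluster_mean, class_mean)]
    scored = [(cosine(direction, emb), phrase) for phrase, emb in phrase_embs.items()]
    scored.sort(reverse=True)
    return [phrase for _, phrase in scored[:top_k]]
```

Usage idea: `build_phrases("car", ["", "red"], ["", "in winter conditions"])` gives the candidate phrases, which then get embedded with the CLIP text encoder and passed to `describe_cluster` as `phrase_embs`. If every cluster still comes back with near-identical top phrases, that would support the intuition that the limit is in the embeddings rather than in the templates.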