Slice
Created: 23 Dec 2022, 02:27 PM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge,

Then run the slicer (slice discovery == fitting a mixture model):
from domino import DominoSlicer

domino = DominoSlicer(
    y_log_likelihood_weight=40,
    y_hat_log_likelihood_weight=40,
    n_mixture_components=60,
    n_slices=10,
    max_iter=10,
    n_pca_components=128,
    init_params="confusion",
    confusion_noise=3e-3,
)
# dp must already hold the columns referenced below: "clip(image)", "target", "prob"
domino.fit(data=dp, embeddings="clip(image)", targets="target", pred_probs="prob")
# Soft (probabilistic) slice assignments, stored as a new column
dp["domino_slices"] = domino.predict_proba(
    data=dp, embeddings="clip(image)", targets="target", pred_probs="prob"
)
What does error-aware mean?
Mixture model == GMM? What’s the diff?
How does it get the slices?
- Formulate the MM using input embeddings (clip image), class labels (target), model predictions (prob)
- Embeddings modelled as Gaussians, labels and predictions modelled as categoricals
- Get the log-likelihood, maximise using Expectation-Maximization (EM)
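- EM is an iterative algorithm for Maximum Likelihood Estimation (MLE), not an alternative to it. When latent variables (here, the slice assignments) mean the MLE has no closed-form solution, EM alternates E and M steps to climb to a local maximum of the likelihood.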
- EM is done in fit_predict
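To make the bullets above concrete for myself: a minimal NumPy sketch of EM on a mixture where embeddings are Gaussian and labels/predictions are categorical. This is not Domino's code. Assumptions: diagonal covariances, hard predicted classes instead of probabilities, random initialisation, and a single `weight` that I assume plays a role similar to `y_log_likelihood_weight` / `y_hat_log_likelihood_weight`. The names `em_fit` and `log_gaussian_diag` are made up for this note.

```python
import numpy as np

def log_gaussian_diag(X, means, variances):
    """Log density of each row of X under each diagonal Gaussian; returns (n, k)."""
    diff = X[:, None, :] - means[None, :, :]                  # (n, k, d)
    quad = np.sum(diff ** 2 / variances[None, :, :], axis=2)  # (n, k)
    log_det = np.sum(np.log(2 * np.pi * variances), axis=1)   # (k,)
    return -0.5 * (quad + log_det[None, :])

def em_fit(X, y, y_hat, k, n_classes, weight=1.0, n_iter=20, seed=0):
    """Toy EM loop (hypothetical helper, not the library's implementation)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    one_hot = np.eye(n_classes)
    Y, Y_hat = one_hot[y], one_hot[y_hat]            # (n, C) each
    resp = rng.dirichlet(np.ones(k), size=n)         # random initial soft assignments
    for _ in range(n_iter):
        # M step: re-estimate parameters from the current responsibilities.
        nk = resp.sum(axis=0) + 1e-10
        pi = nk / nk.sum()                           # mixing weights
        means = resp.T @ X / nk[:, None]
        variances = np.maximum(resp.T @ X ** 2 / nk[:, None] - means ** 2, 0) + 1e-6
        p_y = resp.T @ Y + 1e-6                      # per-component label distribution
        p_y /= p_y.sum(axis=1, keepdims=True)
        p_y_hat = resp.T @ Y_hat + 1e-6              # per-component prediction distribution
        p_y_hat /= p_y_hat.sum(axis=1, keepdims=True)
        # E step: responsibilities under the current parameters.
        log_joint = (
            np.log(pi)[None, :]
            + log_gaussian_diag(X, means, variances)
            + weight * np.log(p_y)[:, y].T
            + weight * np.log(p_y_hat)[:, y_hat].T
        )
        log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
        resp = np.exp(log_joint - log_norm)
    return resp

# Toy data: two blobs, with the model's errors concentrated in half of the second blob.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
y = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
y_hat = y.copy()
y_hat[150:] = 0
resp = em_fit(X, y, y_hat, k=3, n_classes=2)
```

Each row of `resp` is one sample's soft assignment over the 3 components. Because labels and predictions enter the likelihood alongside the embedding, a component can end up concentrating on a region where `y` and `y_hat` disagree, which is my current guess at what "error-aware" refers to.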
What does fit() do?
- MixtureSlicer.fit() calls DominoMixture.fit()
- DominoMixture.fit() calls DominoMixture.fit_predict()
- DominoMixture.fit_predict() runs the E step (DominoMixture._e_step) and the M step (DominoMixture._m_step) inside its for n_iter loop
- After the EM algorithm finishes, what do we have? The clusters?
DominoSlicer ⇒ MixtureSlicer, MixtureSlicer.mm ⇒ DominoMixture
What is MixtureSlicer.predict_proba doing?
- Gives probabilistic slice assignments / clusters
- Calls DominoMixture.predict_proba, which calls _estimate_log_prob_resp. That method:
- Estimates log probabilities and responsibilities for each sample.
- Computes the log probabilities, the weighted log probabilities per component, and the responsibilities for each sample in X with respect to the current state of the model.
- predict_proba then returns the exponential of the log responsibilities, i.e. the responsibilities themselves:
_, log_resp = self._estimate_log_prob_resp(X, y, y_hat)
return np.exp(log_resp)
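A small sketch of what that last step amounts to, assuming the method follows the usual mixture-model pattern (normalise the weighted log probabilities with a log-sum-exp, as scikit-learn's mixture code does). `estimate_log_resp` and the toy numbers are made up for illustration.

```python
import numpy as np

def estimate_log_resp(weighted_log_prob):
    """weighted_log_prob[i, k] = log(pi_k) + log p(sample_i | component k)."""
    # Log of the total (marginal) probability of each sample.
    log_prob_norm = np.logaddexp.reduce(weighted_log_prob, axis=1)
    # Log responsibilities: normalise each row in log space.
    log_resp = weighted_log_prob - log_prob_norm[:, None]
    return log_prob_norm, log_resp

# Two samples, two components (made-up joint probabilities).
weighted_log_prob = np.log(np.array([[0.2, 0.2], [0.09, 0.01]]))
_, log_resp = estimate_log_resp(weighted_log_prob)
resp = np.exp(log_resp)
print(resp)  # rows: [0.5, 0.5] and [0.9, 0.1]
```

So each row of the returned array is a probability distribution over components for one sample.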
- fit(): Learn a set of slicing functions that partition the embedding space into regions with high error rates.
- predict(): Apply the learned slicing functions to data, assigning each datapoint to zero or more slices.
- predict_proba(): Apply the learned slicing functions to data, producing "soft" slice assignments.
From <https://domino-slice.readthedocs.io/en/latest/apidocs/slice.html#slice-reference>
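To picture the difference between "soft" assignments and "zero or more slices": a tiny sketch that turns a soft matrix into a hard one with a cutoff. The 0.5 threshold is my own assumption for illustration; I have not checked what rule predict() actually uses.

```python
import numpy as np

# Made-up soft assignments: 3 datapoints x 3 slices.
soft = np.array([
    [0.7, 0.2, 0.1],
    [0.4, 0.4, 0.2],
    [0.1, 0.1, 0.8],
])

threshold = 0.5  # hypothetical cutoff, not taken from the library
hard = (soft >= threshold).astype(int)
print(hard)
# [[1 0 0]
#  [0 0 0]
#  [0 0 1]]
```

Under this rule the second datapoint belongs to no slice, which is one way a datapoint could land in "zero" slices.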



What is “target”??
Am I interpreting wrongly?

From paper: https://arxiv.org/pdf/2203.14960.pdf