Low-Rank Factorization
Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge
Overview
Related fields
Introduction
- Key idea: replace high-dimensional tensors with lower-dimensional tensors to reduce the number of parameters.
    - For example, you can decompose a 3x3 tensor into the product of a 3x1 and a 1x3 tensor, so that instead of 9 parameters you have only 6 (from Open challenges in LLM research); see the sketch after this list.
- Orthogonal to pruning
    - reduces the rank of the weight matrices rather than the number of individual weights
- Special case of pruning, but not necessarily the same
    - if the weight matrix is low rank, multiplying the factors together still yields a matrix of the same dimensions
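A minimal numpy sketch of the 3x3 example above (the names `A`, `B`, `W` and the use of numpy are my own illustration):

```python
import numpy as np

d, r = 3, 1                          # 3x3 matrix, rank-1 factorisation
A = np.random.randn(d, r)            # 3x1 factor
B = np.random.randn(r, d)            # 1x3 factor
W = A @ B                            # product is still 3x3 -- dims unchanged

print(W.shape)                        # (3, 3)
print(A.size + B.size, "vs", d * d)   # 6 parameters vs 9
```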
Traditional / Historical methods ⇒ purely factorisation
- focuses on the low-rank structure of already-trained weights
    - little focus on low-rank updates to a frozen model for adaptation to downstream tasks (as in LoRA)
- low-rank structure obtained via factorisation
    - identifies redundant parameters of deep neural networks by employing matrix and tensor decompositions
    - e.g. using SVD; see the sketch after this list
- main challenge: the decomposition process is computationally intensive, and the resulting factorised layers are harder to implement
- factorisation can:
    - be applied during or after training
    - be applied to both convolutional and fully connected layers
    - reduce training time when applied during training
    - reduce model size and improve speed by up to 30-50% compared to the full-rank representation when factorising dense-layer matrices
- typically not applied as a low-rank update to a frozen model for adaptation to downstream tasks (i.e. LoRA / PEFT)
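A hedged sketch of the SVD route above: factorise a trained dense-layer weight matrix post-training and keep only the top-r singular directions. The 512x512 shape, the rank r=64, and the random stand-in `W` are illustrative assumptions, not values from any particular paper:

```python
import numpy as np

# Pretend W is a trained 512x512 dense-layer weight matrix (random stand-in here;
# a genuinely trained matrix with low-rank structure would compress far better).
W = np.random.randn(512, 512)
U, S, Vt = np.linalg.svd(W, full_matrices=False)

r = 64                         # chosen rank: the compression hyperparameter
W1 = U[:, :r] * S[:r]          # 512 x r, singular values folded into the left factor
W2 = Vt[:r, :]                 # r x 512

# The dense layer's matmul x @ W.T becomes two smaller ones: (x @ W2.T) @ W1.T,
# cutting parameters from 512*512 (262,144) to 2*512*64 (65,536).
W_approx = W1 @ W2
print(np.linalg.norm(W - W_approx) / np.linalg.norm(W))  # relative approximation error
```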
Newer methods
- Stems from the Parameter-Efficient Fine-Tuning (PEFT) task
    - partial fine-tuning of a pre-trained model in an efficient manner (whether in inference time, compute requirements, or memory requirements); see the LoRA sketch below
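For contrast with the compression-style factorisation sketch above, here is a toy version of the LoRA-style low-rank update to a frozen layer. The dimensions and rank are illustrative, and while the scaling convention (alpha/r, A Gaussian-initialised, B zero-initialised) follows the LoRA paper's setup, this module is a sketch, not the reference implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen weight W plus a trainable low-rank update (alpha/r) * B @ A."""
    def __init__(self, in_dim, out_dim, r=8, alpha=16):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim),
                                   requires_grad=False)       # frozen pre-trained weight
        self.A = nn.Parameter(torch.randn(r, in_dim) * 0.01)  # trainable down-projection
        self.B = nn.Parameter(torch.zeros(out_dim, r))        # zero init: update starts at 0
        self.scale = alpha / r

    def forward(self, x):
        # Frozen path plus low-rank path; only A and B receive gradients.
        return x @ self.weight.T + self.scale * ((x @ self.A.T) @ self.B.T)

layer = LoRALinear(512, 512, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable, "trainable params vs", layer.weight.numel(), "frozen")  # 8192 vs 262144
```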
Questions
- how does LoRA differ from SVD-type factorisations per Low-Rank Factorization? when people talk about low-rank factorisation, are they referring to LoRA or SVD-style methods? ✅ 2024-01-04
    - LoRA is a fine-tuning method (parameter-efficient fine-tuning)
    - factorisation is typically for model compression only, not so much for fine-tuning
    - factorisation ⇒ SVD-style