Rachit Singh - Deep learning model compression
Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags:
Annotations
- One of the most well-known older works in this area prunes filters from a convnet using the L1 norm of the filter’s weights.
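A hedged sketch of that criterion (the layer shape and the number of filters pruned are illustrative, not from the post): compute each filter's L1 norm, then drop the filters with the smallest norms.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical conv layer weights: (out_channels, in_channels, kH, kW)
weights = rng.normal(size=(8, 3, 3, 3))

# L1 norm of each filter: sum of absolute weights over everything but axis 0
l1_norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

# Prune the 2 filters with the smallest L1 norm, keep the rest
prune_idx = np.argsort(l1_norms)[:2]
keep_idx = np.setdiff1d(np.arange(weights.shape[0]), prune_idx)

pruned = weights[keep_idx]  # whole filters removed: shape (6, 3, 3, 3)
```

Because entire filters are removed, the output tensor has fewer channels, so this is a structured form of pruning.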
- Removing neurons or choosing a subnetwork is what I (and others) consider structured pruning
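A minimal sketch of what that means for the weight matrices, assuming a hypothetical two-layer MLP: removing a hidden neuron deletes a row of one matrix and the matching column of the next, so the matrices genuinely shrink.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical MLP with layer sizes 4 -> 5 -> 3
W1 = rng.normal(size=(5, 4))   # each row is one hidden neuron
W2 = rng.normal(size=(3, 5))   # each column consumes one hidden activation

# Structurally prune hidden neuron 2: drop its row in W1 and its column in W2
keep = [0, 1, 3, 4]
W1_pruned = W1[keep, :]        # (4, 4) -- smaller matrix, not a masked one
W2_pruned = W2[:, keep]        # (3, 4)
```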
- Focused on sparsifying model weights so that they are more compressible (what some call unstructured pruning)
- This means the matrices are the same size, but some values are set to 0.
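By contrast, a sketch of unstructured magnitude pruning, with an illustrative 4×4 matrix and 50% sparsity: the matrix keeps its shape, and the smallest-magnitude entries are zeroed.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(4, 4))

# Magnitude pruning at 50% sparsity: zero the smallest-magnitude half
k = W.size // 2
threshold = np.sort(np.abs(W).ravel())[k - 1]  # k-th smallest magnitude
mask = np.abs(W) > threshold
W_sparse = W * mask  # same (4, 4) shape, but half the entries are 0
```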
- The best method I know of is basically to reset the learning rate (learning rate rewinding) and start retraining the network.
- Quantization generally refers to taking a model with parameters trained at high precision (32 or 64 bits) and reducing the number of bits that each weight takes (for example down to 16, 8, or even fewer).
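A sketch of one simple post-training scheme, symmetric uniform quantization to int8 (the bit-width and scaling choice here are illustrative assumptions, not from the post): scale the weights so the largest magnitude maps to 127, round to integers, and keep the scale for dequantization.

```python
import numpy as np

rng = np.random.default_rng(3)
w = rng.normal(size=100).astype(np.float32)  # hypothetical float32 weights

# Symmetric uniform quantization: map [-max|w|, max|w|] onto [-127, 127]
scale = np.abs(w).max() / 127.0
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize to inspect the error introduced by the lower precision
w_hat = q.astype(np.float32) * scale
max_err = float(np.abs(w - w_hat).max())  # bounded by scale / 2
```

Each weight now occupies 8 bits instead of 32, at the cost of a rounding error of at most half a quantization step.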
- Quantization-aware Training (QAT)