Rachit Singh - Deep learning model compression


Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a") Tags:

Annotations


  • One of the most well-known older works in this area prunes filters from a convnet using the L1 norm of the filter’s weights. * show annotation

Pruning Filters for Efficient ConvNets
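
A minimal sketch of the idea in that paper (not their code): rank each convolutional filter by the L1 norm of its weights and keep only the largest ones, which shrinks the tensor itself rather than just zeroing entries. The tensor shape and the number of filters kept are illustrative assumptions.

```python
import numpy as np

# Hypothetical conv weight tensor: (out_channels, in_channels, kH, kW).
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 3, 3, 3))

# L1 norm of each filter = sum of absolute weights over (in, kH, kW).
l1_norms = np.abs(weights).sum(axis=(1, 2, 3))

# Keep the filters with the largest norms; drop the rest entirely.
keep = 4
kept_idx = np.sort(np.argsort(l1_norms)[-keep:])
pruned = weights[kept_idx]

print(pruned.shape)  # (4, 3, 3, 3): a structurally smaller layer
```

Because whole filters are removed, the next layer's input channels shrink correspondingly, which is what makes this *structured* pruning.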

  • Removing neurons or choosing a subnetwork is what I (and others) consider structured pruning * show annotation

  • Focused on sparsifying model weights so that they are more compressible (what some call unstructured pruning) * show annotation

  • means the matrices are the same size, but some values are set to 0. * show annotation

  • The best method I know of is basically to reset the learning rate (learning rate rewinding) and start retraining the network. * show annotation

https://arxiv.org/pdf/2003.02389.pdf
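
A minimal sketch of what learning-rate rewinding means in practice (the step schedule and rewind epoch below are assumptions, not taken from the paper): after pruning, instead of fine-tuning at the final tiny learning rate, rewind the schedule to an earlier epoch and retrain from there.

```python
def lr_schedule(epoch, base_lr=0.1, drops=(30, 60), factor=0.1):
    """Hypothetical step schedule: multiply base_lr by `factor` at each drop epoch."""
    lr = base_lr
    for d in drops:
        if epoch >= d:
            lr *= factor
    return lr

final_epoch = 90
rewind_to = 10  # hypothetical rewind point early in training

finetune_lr = lr_schedule(final_epoch)  # tiny LR (~0.001): standard fine-tuning
rewound_lr = lr_schedule(rewind_to)     # large LR (0.1): rewound retraining
```

The contrast is the whole point: rewinding retrains the pruned network with the large early-schedule learning rate rather than the small final one.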

  • Quantization generally refers to taking a model with parameters trained at high precision (32 or 64 bits) and reducing the number of bits that each weight takes (for example down to 16, 8, or even fewer). * show annotation

Quantization

Quantization Aware Training (QAT)
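
A minimal sketch of the "fake quantization" op at the heart of QAT (assuming a symmetric per-tensor int8 scheme, which is only one common choice): in the forward pass, weights are rounded to the 8-bit grid and dequantized, so training sees the quantization error; a real framework would pass gradients through this op with a straight-through estimator.

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    """Quantize to a symmetric int grid, then dequantize back to float."""
    qmax = 2 ** (num_bits - 1) - 1       # 127 for int8
    scale = np.abs(w).max() / qmax       # per-tensor scale (assumed scheme)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

w = np.array([0.51, -1.27, 0.003, 0.9])
wq = fake_quantize(w)
print(wq)  # values snapped to the int8 grid; error is at most half a step
```

At inference time the same scale lets the weights be stored as actual int8 values, cutting memory 4x relative to float32.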