Iterative magnitude pruning
Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge

From Pruning neural networks without any data by iteratively conserving synaptic flow:
Iterative Magnitude Pruning (IMP) is a recently proposed pruning algorithm that has proven to be successful in finding extremely sparse trainable neural networks at initialization (winning lottery tickets) [10, 11, 12, 44, 45, 46, 47].
The algorithm follows three simple steps.
- First train a network,
- second prune parameters with the smallest magnitude,
- third reset the unpruned parameters to their initialization and repeat until the desired compression ratio is reached.
While simple and powerful, IMP is impractical as it involves training the network several times, essentially defeating the purpose of constructing a sparse initialization. That being said, it does not suffer from the same catastrophic layer-collapse that other pruning-at-initialization methods are susceptible to.
Considered “Pruning before training” as per Pruning.
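The three-step loop above is easy to express in code. Below is a minimal sketch of it, assuming PyTorch, a toy random-regression dataset, a small MLP, and a simple global magnitude threshold; the sizes, training loop, and helper names (`train`, `apply_mask`, `magnitude_prune`) are illustrative assumptions, not from the paper.

```python
import copy
import torch
import torch.nn as nn

def apply_mask(model, mask):
    """Force pruned weights back to zero."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in mask:
                p.mul_(mask[name])

def train(model, mask, steps=200):
    """Brief training on stand-in data; the mask is re-applied so pruned weights stay zero."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(256, 20), torch.randn(256, 1)  # toy dataset
    for _ in range(steps):
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        apply_mask(model, mask)

def magnitude_prune(mask, model, frac=0.2):
    """Zero out the smallest-magnitude fraction of the still-unpruned weights (global threshold)."""
    alive = torch.cat([p[mask[name] == 1].abs()
                       for name, p in model.named_parameters() if name in mask])
    threshold = alive.kthvalue(max(1, int(frac * alive.numel()))).values
    for name, p in model.named_parameters():
        if name in mask:
            mask[name] *= (p.abs() > threshold).float()
    return mask

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
init_state = copy.deepcopy(model.state_dict())  # saved initialization
mask = {n: torch.ones_like(p) for n, p in model.named_parameters() if n.endswith("weight")}

for _ in range(5):                                 # repeat to the desired compression ratio
    train(model, mask)                             # 1. train the network
    mask = magnitude_prune(mask, model, frac=0.2)  # 2. prune smallest-magnitude weights
    model.load_state_dict(init_state)              # 3. rewind survivors to their initial values
    apply_mask(model, mask)                        # keep pruned weights at zero
```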
Questions:
- what does “prune parameters with the smallest magnitude” mean? just set the parameter to zero? ✅ 2023-08-17
- in practice it should completely remove the parameter, since that is structured pruning; if it just sets the parameter to zero, that is unstructured pruning. see Rachit Singh - Deep learning model compression
- does a pruned parameter remain zero when training is repeated? (see the gradient-hook sketch below)
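On that last question: in the lottery-ticket setup the mask is enforced for the whole of retraining, so pruned weights stay exactly zero. One common way to enforce this (an illustrative choice, not prescribed by the paper) is to zero the gradients of pruned entries with a backward hook, sketched below for a single linear layer with plain SGD (no momentum or weight decay, so a zero gradient means the weight never moves).

```python
import torch
import torch.nn as nn

layer = nn.Linear(8, 4)
with torch.no_grad():
    mask = (layer.weight.abs() > layer.weight.abs().median()).float()  # toy mask: keep the larger half
    layer.weight.mul_(mask)                                            # prune: zero the rest
layer.weight.register_hook(lambda grad: grad * mask)                   # pruned entries get zero gradient

opt = torch.optim.SGD(layer.parameters(), lr=0.1)                      # no momentum / weight decay
for _ in range(50):                                                    # retraining
    loss = layer(torch.randn(32, 8)).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

assert torch.all(layer.weight[mask == 0] == 0)                         # pruned weights are still exactly zero
```

Re-applying the mask after every optimizer step (as in the sketch further above) achieves the same effect and is also robust to optimizers with momentum or weight decay.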
References:
Alex Renda, Jonathan Frankle, and Michael Carbin. Comparing Rewinding and Fine-tuning in Neural Network Pruning. arXiv preprint arXiv:2003.02389, 2020.
[10] Jonathan Frankle and Michael Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In International Conference on Learning Representations, 2019.
[11] Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, and Michael Carbin. Stabilizing the lottery ticket hypothesis. arXiv preprint, 2019.
[12] Ari Morcos, Haonan Yu, Michela Paganini, and Yuandong Tian. One ticket to win them all: Generalizing lottery ticket initializations across datasets and optimizers. In Advances in Neural Information Processing Systems, pages 4933–4943, 2019.
[44] Hattie Zhou, Janice Lan, Rosanne Liu, and Jason Yosinski. Deconstructing lottery tickets: Zeros, signs, and the supermask. In Advances in Neural Information Processing Systems, pages 3592–3602, 2019.
[45] Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, and Yingyan Lin. Drawing early-bird tickets: Toward more efficient training of deep networks. In International Conference on Learning Representations, 2020.
[46] Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, and Michael Carbin. Linear mode connectivity and the lottery ticket hypothesis. arXiv preprint arXiv:1912.05671, 2019.
[47] Haonan Yu, Sergey Edunov, Yuandong Tian, and Ari S. Morcos. Playing the lottery with rewards and multiple languages: Lottery tickets in RL and NLP. In International Conference on Learning Representations, 2020.