MIT 2023 - TinyML and Efficient Deep Learning Computing (Prof. Song Han)


Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a") Tags: knowledge


Lightweight AI, Embedded AI, Efficient AI


Overview


  • This course focuses on efficient machine learning and systems.
  • This is a crucial area, as deep neural networks demand extraordinary levels of computation, hindering their deployment on everyday devices and burdening cloud infrastructure.
  • This course introduces efficient AI computing techniques that enable powerful deep learning applications on resource-constrained devices. (Lightweight AI, Embedded AI, Efficient AI)
  • Topics include Neural Network Compression, Pruning, Quantization, Neural Architecture Search, Distributed Training, Data Parallelism / Model Parallelism, Gradient Compression, On-Device Fine-Tuning, and Parameter-Efficient Fine-Tuning (PEFT).
  • It also introduces application-specific acceleration techniques for large language models and diffusion models.
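Two of the topics above, pruning and quantization, can be illustrated with a minimal sketch. This is an assumption-laden toy example (not the course's TinyEngine implementation): magnitude pruning zeroes out the smallest-magnitude weights, and symmetric linear quantization maps floats to 8-bit integers with a single scale factor.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Illustrative magnitude pruning: zero out the smallest |w| values."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def linear_quantize(weights, n_bits=8):
    """Illustrative symmetric linear quantization to n_bits integers."""
    qmax = 2 ** (n_bits - 1) - 1          # 127 for int8
    scale = np.abs(weights).max() / qmax   # one scale for the whole tensor
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

w = np.array([0.9, -0.05, 0.4, -0.7, 0.01, 0.3])
pruned = magnitude_prune(w, sparsity=0.5)  # half the weights become zero
q, scale = linear_quantize(w)              # int8 codes plus a scale factor
```

Dequantizing with `q * scale` recovers each weight to within one quantization step; real systems (covered in Lectures 5-6) add per-channel scales, zero points, and quantization-aware training on top of this idea.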


Material

Other information


Chapter 0 - Introduction


EfficientDL - 1. Introduction

EfficientDL - 2. Basics of Deep Learning


Chapter I - Efficient Inference



EfficientDL - 3. Pruning and Sparsity (Part I)

EfficientDL - 4. Pruning and Sparsity (Part II)

EfficientDL - 5. Quantization (Part I)

EfficientDL - 6. Quantization (Part II)

7. Neural Architecture Search (Part I)

8. Neural Architecture Search (Part II)

9. Knowledge Distillation

Knowledge Distillation

  • do this

10. MCUNet: TinyML on Microcontrollers

11. TinyEngine and Parallel Processing

  • do this

Chapter II - Domain-Specific Optimization



12. Transformer and LLM (Part I)

Efficient Transformers

13. Transformer and LLM (Part II)

Efficient Transformers

14. Vision Transformer

Vision Transformers (ViT)

15. GAN, Video, and Point Cloud

16. Diffusion Model


Chapter III - Efficient Training



17. Distributed Training (Part I)

Parallelism

18. Distributed Training (Part II)

Parallelism

19. On-Device Training and Transfer Learning

20. Efficient Fine-tuning and Prompt Engineering

Parameter Efficient Fine-Tuning (PEFT)


Chapter IV - Advanced Topics (Quantum Machine Learning)


21. Basics of Quantum Computing

22. Quantum Machine Learning

23. Noise Robust Quantum ML