Lightweight AI, Embedded AI, Efficient AI
Created: `=dateformat(this.file.ctime, "dd MMM yyyy, hh:mm a")` | Modified: `=dateformat(this.file.mtime, "dd MMM yyyy, hh:mm a")`
Tags: knowledge
Overview
Related fields
- TinyML - efficient deep learning on edge & embedded devices
- Neural Network Compression
- Optimising GPU code
- Machine Learning Compilers (Graph Compilers)
- Efficient Transformers
- Mixture of Experts (MoE)
- TensorFlow Lite vs PyTorch Mobile vs TVM Runtime vs ONNX Runtime vs TensorRT
- Parallelism
- Mixed-Precision Training
- Parameter Efficient Fine-Tuning (PEFT)
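A recurring idea across several of these fields (network compression, mixed precision, TinyML) is trading numeric precision for memory and speed. A minimal sketch of symmetric per-tensor int8 post-training quantization in pure Python - illustrative only, not any particular library's API:

```python
# Hedged sketch: symmetric per-tensor int8 quantization with a single scale.
# Real toolchains use per-channel scales, zero-points, calibration data, etc.
def quantize_int8(weights):
    """Map float weights into int8 range [-127, 127] with one scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(q, scale):
    """Recover approximate floats from the int8 values."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 1.0, -0.98]   # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The round-trip error is bounded by half a quantization step (`scale / 2`), which is why 8-bit weights usually cost little accuracy while quartering fp32 storage.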
Introduction
Theoretical References
Papers
Articles
Courses
- MIT 2022 - TinyML EfficientML Course (Prof Song Han)
- annotation of the pdf file MIT 2022 - TinyML EfficientML Course (Prof Song Han).pdf
- partial
- MIT 2023 - TinyML and Efficient Deep Learning Computing (Prof Song Han)
- annotation of the pdf file MIT 2023 - TinyML EfficientML Course (Prof Song Han).pdf
- in progress
Code References
Methods
Tools, Frameworks
- C++ implementation of DL algos
- C++ implementation of other algos
- fast, header-only C++ machine learning library
	- a machine learning analog to LAPACK
- inference engine for microcontrollers and embedded devices (C99)
	- sklearn- and Keras-compatible
- Transform ML models into native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
- Generate C code for microcontrollers from Python's sklearn classifiers
- C++/CUDA neural network framework
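The model-to-code generators above all work on the same principle: compile a fitted model into branch-based native source with no runtime dependency. A hedged illustration of that idea - the tree, thresholds, and `predict` signature here are made up for the demo, not any generator's actual output:

```python
# Hypothetical tiny decision tree; structure and values invented for the demo.
TREE = {"feature": 0, "threshold": 0.5,
        "left": {"leaf": 1},
        "right": {"feature": 1, "threshold": 2.0,
                  "left": {"leaf": 0}, "right": {"leaf": 1}}}

def emit_c(node, indent="    "):
    """Recursively emit nested if/else C code for one tree node."""
    if "leaf" in node:
        return f"{indent}return {node['leaf']};\n"
    out = f"{indent}if (x[{node['feature']}] <= {node['threshold']}f) {{\n"
    out += emit_c(node["left"], indent + "    ")
    out += f"{indent}}} else {{\n"
    out += emit_c(node["right"], indent + "    ")
    out += f"{indent}}}\n"
    return out

# The emitted function needs only the C standard ABI - ideal for MCUs.
c_source = "int predict(const float *x) {\n" + emit_c(TREE) + "}\n"
```

Because the result is plain branches and constants, it runs on bare-metal targets with no ML runtime and lets the C compiler optimize the whole model.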
- TIDL - TI deep learning, used by the Conti AM ADAS team
- FFCV - drop-in data loading system that dramatically increases data throughput in model training
- apple/ml-cvnets: CVNets, a library for training computer vision networks - Apple's CV networks, e.g. MobileViT
- ggerganov/ggml: Tensor library for machine learning
	- ggml = GPT-Generated Model Language
	- file format updated in August 2023 to GGUF = GPT-Generated Unified Format
	- whisper.cpp - local inference for OpenAI's automatic speech recognition model
	- llama.cpp - local inference for Meta's LLaMA LLMs
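ggml's quantized formats (e.g. Q8_0) store weights in fixed-size blocks with one scale per block, which keeps outliers in one block from wrecking precision elsewhere. A simplified pure-Python illustration of that block-quantization idea - not the actual on-disk GGUF layout:

```python
# Hedged sketch of block-wise int8 quantization in the spirit of ggml's Q8_0.
BLOCK = 32  # Q8_0 also uses 32-element blocks; layout details differ.

def quantize_blocks(weights):
    """One float scale + up to 32 int8 values per block."""
    blocks = []
    for i in range(0, len(weights), BLOCK):
        chunk = weights[i:i + BLOCK]
        scale = max(abs(w) for w in chunk) / 127 or 1.0  # avoid 0 scale
        blocks.append((scale, [round(w / scale) for w in chunk]))
    return blocks

def dequantize_blocks(blocks):
    """Flatten blocks back into approximate floats."""
    return [q * scale for scale, qs in blocks for q in qs]

weights = [i / 10 for i in range(-32, 32)]   # 64 made-up values -> 2 blocks
blocks = quantize_blocks(weights)
restored = dequantize_blocks(blocks)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Per-block scales are why llama.cpp can run multi-billion-parameter models in a few GB of RAM with modest quality loss.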
- DeepSpeed - Microsoft Research
	- an open-source deep learning optimization library for PyTorch
	- DeepSpeed-Training
		- ZeRO, 3D-Parallelism, DeepSpeed-MoE, ZeRO-Infinity
	- DeepSpeed-Inference, [Blog]
		- parallelism technologies: tensor, pipeline, expert and ZeRO-parallelism
		- custom inference kernels, communication optimizations and heterogeneous memory technologies
	- DeepSpeed-Compression, [Blog]
		- ZeroQuant, XTC
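The core idea behind ZeRO stage 1 is that plain data parallelism replicates the full optimizer state on every rank, while ZeRO partitions it so each rank owns only a 1/N shard. A back-of-the-envelope sketch - the 12 bytes/parameter figure and function are illustrative assumptions, not DeepSpeed's actual memory model:

```python
# Hedged sketch of ZeRO-1's memory saving. Assumes roughly 12 bytes/param of
# Adam optimizer state (fp32 moments + master weights); real totals vary.
def optimizer_state_bytes(n_params, n_ranks, bytes_per_param=12):
    """Return (replicated bytes per rank, ZeRO-1 sharded bytes per rank)."""
    replicated = n_params * bytes_per_param  # every rank holds a full copy
    sharded = replicated // n_ranks          # each rank owns a 1/N partition
    return replicated, sharded

# Illustrative 7B-parameter model on 64 data-parallel ranks.
replicated, sharded = optimizer_state_bytes(7_000_000_000, 64)
```

Later stages extend the same partitioning to gradients (stage 2) and the parameters themselves (stage 3), and ZeRO-Infinity spills shards to CPU/NVMe.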
- FairScale - PyTorch extension library for high-performance and large-scale training
- facebookresearch/d2go: D2Go is a toolkit for efficient deep learning
	- end-to-end model training and deployment for mobile platforms
	- powered by PyTorch and Detectron2
- hpcaitech/ColossalAI: Making large AI models cheaper, faster and more accessible
- pytorch/executorch: On-device AI across mobile, embedded and edge for PyTorch
- pytorch/ao: PyTorch-native quantization and sparsity for training and inference
- VainF/Torch-Pruning: [CVPR 2023] DepGraph: Towards Any Structural Pruning
- daquexian/onnx-simplifier: Simplify your ONNX model