Machine Learning Compilers (Graph Compilers)


Created: 03 Jan 2023, 12:49 PM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a") Tags: knowledge, tools, TinyML


Overview

Introduction

Transclude of Machine-Learning-Compilers-(Graph-Compilers)-2025-10-30-11.23.21.excalidraw

Youtube Video, Blogpost

As more companies bring ML to the edge, and more hardware is built for ML models, a growing number of compilers are being developed to bridge the gap between ML models and hardware accelerators: MLIR dialects, Apache TVM, XLA, PyTorch Glow, etc. (cuDNN, often mentioned alongside these, is a library of hand-tuned kernels rather than a compiler).

Intermediate representation (IR) as middle man

High-level IRs and low-level IRs.

High-level IRs include MLIR and TVM

MLIR

TVM

“Lowering”: progressively translating a high-level IR through successively lower-level IRs until it reaches hardware-specific code.
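A toy sketch of what lowering means (purely illustrative Python, not real MLIR or TVM): each pass rewrites the IR one level closer to the machine, here from a graph-level op, to a loop nest, to a target-tagged form. All names and the string IR are made up for illustration.

```python
def lower_graph_to_loops(graph_op):
    # Graph-level op -> textual loop-nest IR (the level where passes
    # like tiling and fusion would operate in a real compiler).
    op, a, b, c = graph_op
    assert op == "matmul"
    return f"for i,j,k: {c}[i,j] += {a}[i,k]*{b}[k,j]"

def lower_loops_to_target(loop_ir, target="llvm-cpu"):
    # Loop IR -> target-specific form (stand-in for codegen).
    return f"[{target}] {loop_ir}"

ir = ("matmul", "A", "B", "C")
print(lower_loops_to_target(lower_graph_to_loops(ir)))
# [llvm-cpu] for i,j,k: C[i,j] += A[i,k]*B[k,j]
```

Real compilers do this over many more levels (e.g. MLIR dialect-to-dialect conversions), but the shape is the same: a pipeline of IR-to-IR rewrites.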

Python can’t run on a GPU. To bridge this gap, researchers build Embedded Domain-Specific Languages (eDSLs) e.g. OpenAI Triton.

  • An “eDSL” is a DSL that reuses an existing language’s syntax but changes how the code executes using compiler techniques.
  • eDSLs work by capturing Python code before it runs and transforming it into a form they can process. They typically rely on decorators, a Python feature that intercepts a function before it is called.
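A minimal sketch of the interception idea, using tracing with proxy objects (JAX-style); Triton itself captures and walks the function’s AST instead, but the decorator-as-entry-point pattern is the same. Everything here (`Sym`, `edsl`, the string IR) is hypothetical, for illustration only:

```python
# A decorator intercepts the function and runs it once with symbolic
# proxy values, recording every operation into a tiny IR instead of
# computing numbers.
class Sym:
    def __init__(self, expr):
        self.expr = expr  # string IR, for illustration

    def __add__(self, other):
        return Sym(f"add({self.expr}, {_expr(other)})")

    def __mul__(self, other):
        return Sym(f"mul({self.expr}, {_expr(other)})")

def _expr(v):
    return v.expr if isinstance(v, Sym) else repr(v)

def edsl(fn):
    def compiled(*names):
        # Intercept: feed symbols through the Python body to capture IR.
        return fn(*(Sym(n) for n in names)).expr
    return compiled

@edsl
def kernel(x, y):
    return x * y + 2

print(kernel("x", "y"))  # add(mul(x, y), 2)
```

A real eDSL would then optimize and compile this captured IR for the target device rather than returning a string.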

IREE

Optimization through IREE Compiler

IREE (Intermediate Representation Execution Environment) is an MLIR-based end-to-end AI/ML compiler and runtime. The architecture overview is shown in Figure 4. In IREE, the input model is lowered to MLIR, several levels of optimization are applied (such as kernel fusion, tiling, and loop unrolling), and the result is finally translated into target-dependent VM bytecode, which executes on the IREE runtime.
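The pipeline above maps onto IREE’s two main command-line tools: `iree-compile` (model-as-MLIR in, VM bytecode out) and `iree-run-module` (execute the bytecode on the runtime). A sketch of the flow, assuming the tools are installed and `model.mlir` exists; flag names follow IREE’s documentation but may differ across versions:

```shell
# Compile MLIR input to a VM bytecode module (.vmfb) for the CPU backend.
iree-compile --iree-hal-target-backends=llvm-cpu model.mlir -o model.vmfb

# Run an exported function from the module on the IREE runtime,
# passing a sample f32 input.
iree-run-module --module=model.vmfb --function=main --input="f32=-2.0"
```

Swapping `llvm-cpu` for another backend (e.g. a GPU target) retargets the same model, which is the point of the target-independent optimization levels above.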


Theoretical References

Papers

Articles

Courses


Code References

Methods

Tools, Frameworks