Optimising PyTorch in General
Created: 29 Nov 2022, 06:06 PM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge, GeneralDL
PyTorch Performance Tuning Guide - Szymon Migacz, NVIDIA
https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html
https://twitter.com/karpathy/status/1299921324333170689


Faster Deep Learning Training with PyTorch — a 2021 Guide
https://efficientdl.com/faster-deep-learning-in-pytorch-a-guide/
- Consider using a different learning rate schedule, e.g. the 1cycle policy (sketch after this list).
- Use multiple workers and pinned memory in DataLoader (sketch below).
- Max out the batch size.
- Use Automatic Mixed Precision (AMP) (sketch below).
- Consider using a different optimizer.
- Turn on cuDNN benchmarking (see the one-line switches below).
- Beware of frequently transferring data between CPU and GPU.
- Use gradient/activation checkpointing (sketch below).
- Use gradient accumulation (sketch below).
- Use DistributedDataParallel for multi-GPU training (sketch below).
- Set gradients to None rather than 0 (see the one-line switches below).
- Use torch.as_tensor() rather than torch.tensor() when converting NumPy arrays (sketch below).
- Turn off debugging APIs if not needed (see the one-line switches below).
- Use gradient clipping (sketch below).
- Turn off bias in layers that feed directly into BatchNorm (sketch below).
- Turn off gradient computation during validation (sketch below).
- Use input and batch normalization.
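
Minimal sketches for the tips above follow; models, data, and hyperparameters are stand-ins, not recommendations. First, the learning rate schedule: the guide suggests the 1cycle policy, available as `OneCycleLR`.

```python
import torch

# Stand-in model/optimizer; substitute your own training loop.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# OneCycleLR ramps the LR up to max_lr, then anneals it back down
# over total_steps batches; call scheduler.step() once per batch.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.1, total_steps=100
)

for step in range(100):
    loss = model(torch.randn(32, 10)).sum()
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
    scheduler.step()
```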
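DataLoader with multiple workers and pinned memory; the toy dataset and the worker count are assumptions (the guide's rule of thumb is roughly 4 workers per GPU).

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))
device = "cuda" if torch.cuda.is_available() else "cpu"

# num_workers > 0 loads batches in background processes; pin_memory=True
# puts batches in page-locked RAM so host-to-GPU copies can run async.
loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4,
    pin_memory=True,
)

for x, y in loader:
    # non_blocking=True only pays off when the source tensor is pinned
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
```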
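AMP with the standard autocast + GradScaler pairing; assumes a CUDA device, and the model and loss are placeholders.

```python
import torch

model = torch.nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()

for _ in range(100):
    x = torch.randn(32, 10, device="cuda")
    optimizer.zero_grad(set_to_none=True)
    # autocast runs eligible ops in float16 and keeps the rest in float32
    with torch.cuda.amp.autocast():
        loss = model(x).sum()
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)         # unscales grads, then optimizer.step()
    scaler.update()
```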
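The one-line switches (cuDNN benchmarking, debugging APIs, grads-to-None) gathered in one place:

```python
import torch

# cuDNN autotunes conv algorithms for the shapes it sees: a win when
# input sizes are fixed, counterproductive when they vary per batch.
torch.backends.cudnn.benchmark = True

# Anomaly detection is expensive; make sure it is off for real runs
# (likewise, only wrap code in torch.autograd.profiler contexts when
# you are actually profiling).
torch.autograd.set_detect_anomaly(False)

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# set_to_none=True frees the grad tensors instead of zero-filling them,
# skipping a memset kernel; it became the default in PyTorch 2.0.
optimizer.zero_grad(set_to_none=True)
```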
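Gradient/activation checkpointing via `torch.utils.checkpoint`, trading compute for memory; the model here is a toy stand-in.

```python
import torch
from torch.utils.checkpoint import checkpoint_sequential

# Activations inside checkpointed segments are recomputed during the
# backward pass instead of being stored, cutting peak memory.
model = torch.nn.Sequential(
    torch.nn.Linear(100, 100), torch.nn.ReLU(),
    torch.nn.Linear(100, 100), torch.nn.ReLU(),
    torch.nn.Linear(100, 10),
)

x = torch.randn(32, 100, requires_grad=True)
out = checkpoint_sequential(model, 2, x)  # split into 2 segments
out.sum().backward()
```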
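Gradient accumulation to simulate a larger batch than fits in memory; the batch size and accumulation factor are arbitrary.

```python
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
accum_steps = 4  # effective batch = batch_size * accum_steps

for step in range(100):
    x = torch.randn(8, 10)
    loss = model(x).sum() / accum_steps  # average over the virtual batch
    loss.backward()                      # grads accumulate across calls
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
```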
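A minimal DistributedDataParallel skeleton, assuming one process per GPU launched with torchrun; unlike DataParallel, DDP overlaps the gradient all-reduce with the backward pass and avoids the GIL.

```python
# Launch with: torchrun --nproc_per_node=<num_gpus> train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")  # torchrun supplies the env vars
    rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(10, 2).to(rank), device_ids=[rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(100):
        x = torch.randn(32, 10, device=rank)
        optimizer.zero_grad(set_to_none=True)
        model(x).sum().backward()  # grads all-reduced during backward
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```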
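torch.as_tensor() vs torch.tensor() when converting from NumPy:

```python
import numpy as np
import torch

arr = np.random.rand(1000, 1000)

t1 = torch.tensor(arr)      # always copies the data
t2 = torch.as_tensor(arr)   # reuses the NumPy buffer when dtype/device allow
t3 = torch.from_numpy(arr)  # same idea, NumPy-specific

# Caveat: t2/t3 share memory with arr, so mutating arr mutates them too.
```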
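Gradient clipping by global norm; max_norm=1.0 is an arbitrary placeholder.

```python
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for _ in range(100):
    loss = model(torch.randn(32, 10)).sum()
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    # Rescale gradients so their global L2 norm is at most max_norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
```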
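Dropping the bias in a layer that feeds straight into BatchNorm: BN subtracts the batch mean immediately, so the bias is redundant.

```python
import torch.nn as nn

# The BatchNorm's own shift (beta) replaces the conv bias, so bias=False
# saves parameters and a little compute with no change in expressiveness.
block = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)
```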
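Disabling autograd during validation:

```python
import torch

model = torch.nn.Linear(10, 2)
model.eval()  # also switches BatchNorm/Dropout to eval behaviour

# no_grad skips building the autograd graph, cutting memory and time;
# on PyTorch >= 1.9, torch.inference_mode() is a stricter, faster variant.
with torch.no_grad():
    preds = model(torch.randn(32, 10))
```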