On-Device Training Under 256KB Memory


Created: =dateformat(this.file.ctime,"dd MMM yyyy, hh:mm a") | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a") Tags: knowledge,

Annotations


imization difficulty, we propose Quantization-Aware Scaling (QAS) to automatically scale the gradient of tensors with different bit-precisions, which effectively stabilizes the training and matches the accuracy of the floating-point counterpart (Section 2.1). QAS is hyper-para
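
To make the gradient-scaling idea in the annotation above concrete, here is a minimal sketch. It is not taken from this note: it assumes a per-tensor scale `s_w` with `w_fp ≈ s_w * w_q`, ignores rounding, and assumes the compensation is a multiplication of the gradient by `s_w ** -2`. The helper `qas_scale_gradient` and all numbers are hypothetical. The point is only that, under these assumptions, the weight-to-gradient norm ratio of the quantized tensor is off by a factor of `s_w ** -2` and the rescaling brings it back to the floating-point value.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def qas_scale_gradient(grad, scale):
    """Hypothetical helper: multiply a gradient by scale**-2."""
    return [g / (scale * scale) for g in grad]


# Made-up floating-point weights and gradient for one tensor.
w_fp = [0.5, -1.0, 0.25]
g_fp = [0.01, 0.02, -0.04]

# Assumed per-tensor quantization scale: w_fp is approximately s_w * w_q.
s_w = 0.05

# Weights expressed in the quantized domain (rounding ignored for clarity).
w_q = [w / s_w for w in w_fp]
# Chain rule: dL/dw_q = s_w * dL/dw_fp.
g_q = [g * s_w for g in g_fp]

ratio_fp = l2_norm(w_fp) / l2_norm(g_fp)   # reference ratio
ratio_q = l2_norm(w_q) / l2_norm(g_q)      # distorted by s_w**-2
ratio_qas = l2_norm(w_q) / l2_norm(qas_scale_gradient(g_q, s_w))

print("floating-point ratio:", ratio_fp)
print("quantized ratio:     ", ratio_q)
print("ratio after scaling: ", ratio_qas)
```

Because the factor depends only on the scale each tensor already carries, this sketch introduces no extra knob to tune. Tensors with different bit-precisions would each use their own scale.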