A Comparative Study of CNN Optimization Methods for Edge AI: Exploring the Role of Early Exits

📰 ArXiv cs.AI

arXiv:2604.14789v1 Announce Type: new

Abstract: Deploying deep neural networks on edge devices requires balancing accuracy, latency, and resource constraints under realistic execution conditions. To fit models within these constraints, two broad strategies have emerged: static compression techniques such as pruning and quantization, which permanently reduce model size, and dynamic approaches such as early-exit mechanisms, which adapt computational cost at runtime. While both families are widely…
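The early-exit idea in the abstract can be sketched in a few lines: attach a lightweight classifier head after each backbone stage and stop as soon as the head's top-1 softmax confidence clears a threshold, falling through to the final head otherwise. The sketch below is a minimal illustration under assumed toy stages and heads (plain Python callables), not the paper's architecture; the function name `early_exit_predict` and the threshold value are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def early_exit_predict(stages, exit_heads, x, threshold=0.9):
    """Run backbone stages in order; after each, an exit head scores
    the intermediate features. Return (predicted_class, exit_index)
    at the first head whose top-1 confidence >= threshold, or at the
    final head as a fallback."""
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        x = stage(x)                 # forward through one backbone stage
        probs = softmax(head(x))     # cheap auxiliary classifier
        conf = max(probs)
        if conf >= threshold or i == len(stages) - 1:
            return probs.index(conf), i

# Toy usage: two "stages" and identity exit heads over a 2-class logit vector.
stages = [lambda v: [u * 2 for u in v], lambda v: [u + 1 for u in v]]
heads = [lambda v: v, lambda v: v]
print(early_exit_predict(stages, heads, [0.1, 3.0]))  # confident -> exits at stage 0
print(early_exit_predict(stages, heads, [0.1, 0.2]))  # uncertain -> runs to the last stage
```

The runtime saving comes from skipping the remaining stages on "easy" inputs, which is exactly the dynamic cost adaptation the abstract contrasts with static pruning and quantization.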

Published 17 Apr 2026