ENERGY- AND LATENCY-EFFICIENT DNN COMPRESSION FOR EDGE IOT SYSTEMS
Keywords:
DNN Compression, IoT, Low-Latency Computing, Model Pruning, Quantization, Edge AI, Real-Time Processing

Abstract
The rapid growth of Internet of Things (IoT) systems has heightened the need for efficient deep neural network (DNN) inference under stringent latency and resource constraints. Conventional DNNs demand substantial compute and memory, making them unsuitable for real-time IoT deployments with limited memory, bandwidth, and processing capability. This paper proposes a low-latency DNN compression framework that combines structured pruning, quantization-aware training, and lightweight model reparameterization. The proposed method reduces computational complexity while maintaining competitive accuracy, enabling faster inference on edge IoT devices. Experimental evaluations demonstrate up to a 62% reduction in model size and a 48% improvement in inference speed. The approach provides a scalable and energy-efficient solution for real-time IoT applications.
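To make the two core compression steps concrete, the sketch below illustrates the general idea of structured (channel-level) pruning by L1 norm and uniform symmetric 8-bit quantization. This is a minimal, self-contained illustration of the standard techniques named in the abstract, not the paper's actual implementation; the function names, the 0.5 keep ratio, and the plain-list weight representation are assumptions for readability.

```python
def prune_channels(weight, keep_ratio=0.5):
    """Structured pruning sketch: keep the output channels (rows)
    with the largest L1 norms and drop the rest.
    `weight` is a list of rows, each row a list of floats.
    Illustrative only -- not the paper's exact criterion."""
    norms = [sum(abs(v) for v in row) for row in weight]
    k = max(1, int(len(weight) * keep_ratio))
    # indices of the k strongest channels, in original order
    keep = sorted(sorted(range(len(weight)), key=lambda i: norms[i])[-k:])
    return [weight[i] for i in keep], keep

def quantize_int8(weight):
    """Uniform symmetric 8-bit quantization: map floats to integers
    in [-127, 127] with a single per-tensor scale factor."""
    flat = [v for row in weight for v in row]
    scale = max(max(abs(v) for v in flat), 1e-8) / 127.0
    q = [[max(-127, min(127, round(v / scale))) for v in row]
         for row in weight]
    return q, scale

# Toy 4x3 weight matrix: prune half the channels, then quantize.
w = [[0.9, -1.2, 0.4], [0.01, 0.02, -0.01],
     [-0.8, 0.7, 1.1], [0.05, -0.03, 0.02]]
w_pruned, kept = prune_channels(w, keep_ratio=0.5)
q, scale = quantize_int8(w_pruned)
w_hat = [[qi * scale for qi in row] for row in q]  # dequantized view
```

Because the scale maps the largest magnitude to 127 and rounding is to the nearest level, the per-weight dequantization error is bounded by `scale / 2`, which is the basic trade-off that quantization-aware training learns to absorb.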