Publications

Advancements in Accelerating Deep Neural Network Inference on AIoT Devices: A Survey

Cheng, Long; Gu, Yan; Liu, Qingzhi; Yang, Lei; Liu, Cheng; Wang, Ying

Summary

The integration of artificial intelligence with Internet of Things devices (AIoT) has grown rapidly, largely due to the effective implementation of deep neural network (DNN) models across various domains. However, deploying DNNs on such devices poses challenges, primarily related to computational capacity, storage, and energy efficiency. This survey offers a comprehensive review of techniques designed to accelerate DNN inference on AIoT devices and address these challenges. We cover model compression techniques that adapt models to device limitations, as well as hardware optimization strategies that aim to boost efficiency. Furthermore, we examine parallelization methods that leverage parallel computing for fast inference, and novel optimization strategies that fine-tune the execution process. The survey also looks ahead at emerging trends, including advancements in mobile hardware, the co-design of software and hardware, privacy and security considerations, and DNN inference on AIoT devices with constrained resources. Overall, this survey aims to serve as a holistic guide to advancements in accelerating DNN inference on AIoT devices, with the goal of supporting sustainable computing for upcoming IoT applications driven by artificial intelligence.