An adaptive multi-level quantization-based reinforcement learning model for enhancing UAV landing on moving targets

Bibliographic Details
Main Authors: Abo Mosali, Najmaddin, Shamsudin, Syariful Syafiq, Mostafa, Salama A., Alfandi, Omar, Omar, Rosli, Al-Fadhali, Najib, Mohammed, Mazin Abed, Malik, R. Q., Jaber, Mustafa Musa, Saif, Abdu
Format: Article
Published: MDPI 2022
Online Access:http://eprints.um.edu.my/41645/
Description
Summary: The autonomous landing of an unmanned aerial vehicle (UAV) on a moving platform is an essential functionality in various UAV-based applications. It can be added to a teleoperated UAV system or be part of an autonomous UAV control system. Various robust and predictive control systems based on traditional control theory are used for operating a UAV. Recently, some attempts have been made to land a UAV on a moving target using reinforcement learning (RL). Vision is typically used to sense and detect the moving target. The related works have mainly deployed a deep neural network (DNN) for RL, which takes the image as input and outputs the optimal navigation action. However, the delay introduced by the multi-layer topology of the DNN affects the real-time performance of such control. This paper proposes an adaptive multi-level quantization-based reinforcement learning (AMLQ) model. The AMLQ model quantizes the continuous actions and states so that simple Q-learning can be applied directly, resolving the delay issue. This solution makes the training faster and enables simple knowledge representation without requiring a DNN. For evaluation, the AMLQ model was compared with state-of-the-art approaches and found superior in terms of root mean square error (RMSE), achieving an RMSE of 8.7052 compared with 10.0592 for the proportional-integral-derivative (PID) controller.
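
The core idea summarized above, quantizing continuous states and actions so that tabular Q-learning can replace a DNN-based policy, can be illustrated with a minimal sketch. The bin counts, toy 1-D dynamics, reward, and hyper-parameters below are illustrative assumptions and do not reproduce the paper's actual AMLQ formulation or its adaptive multi-level refinement scheme.

    # Minimal sketch: tabular Q-learning on a quantized 1-D tracking task.
    # All names, bin counts, dynamics, and rewards are illustrative assumptions.
    import numpy as np

    def quantize(value, lo, hi, levels):
        """Map a continuous value in [lo, hi] to one of `levels` discrete bins."""
        idx = int((value - lo) / (hi - lo) * levels)
        return min(max(idx, 0), levels - 1)

    ACTIONS = np.linspace(-1.0, 1.0, 5)   # assumed lateral velocity commands (m/s)
    STATE_LEVELS = 21                     # assumed number of quantization levels
    Q = np.zeros((STATE_LEVELS, len(ACTIONS)))

    alpha, gamma, epsilon = 0.1, 0.95, 0.1
    rng = np.random.default_rng(0)

    def step(error, action, dt=0.1):
        """Toy dynamics: the UAV reduces its horizontal offset to the platform."""
        new_error = error - action * dt + rng.normal(0.0, 0.01)  # platform drift as noise
        reward = -abs(new_error)                                 # closer is better
        return new_error, reward

    for episode in range(2000):
        error = rng.uniform(-2.0, 2.0)    # initial UAV-platform offset (m)
        for _ in range(100):
            s = quantize(error, -2.0, 2.0, STATE_LEVELS)
            a = rng.integers(len(ACTIONS)) if rng.random() < epsilon else int(np.argmax(Q[s]))
            error, r = step(error, ACTIONS[a])
            s_next = quantize(error, -2.0, 2.0, STATE_LEVELS)
            # Standard tabular Q-learning update on the quantized state-action pair.
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

Because the lookup and update are simple table operations rather than forward passes through a multi-layer network, action selection is essentially constant-time, which is the latency advantage the abstract attributes to the quantization-based approach.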