Traffic control strategy for adaptive signal controller based on reinforcement learning and local communication channel

Bibliographic Details
Main Author: Muaid, Abdulkareem Alnazir Ahmed
Format: Final Year Project / Dissertation / Thesis
Published: 2023
Online Access: http://eprints.utar.edu.my/6239/1/MUAID_ABDULKAREEM_ALNAZIR_AHMED.pdf
http://eprints.utar.edu.my/6239/
Description
Summary: This research study is in the field of deep reinforcement learning (DRL) adaptive controllers. The developed DRL controller is an off-policy, model-free agent based on the Q-learning algorithm. The research aims to address several issues that remain pressing challenges for intelligent signal systems in the existing DRL work direction: the ability of the DRL agent to manage signal operation under varying traffic flow conditions, the extent of the model environment used in developing the DRL agent, the under-representation and oversimplification of traffic dynamics, the utilisation of futuristic communication technology, and the ability of the DRL system to manage signalised junctions across an arterial network. An innovative control strategy is proposed to make the single-system design efficient for global optimisation at network-level operation. The introduced downstream policy adapts the signal operation to the available capacity of discharge routes. An illustrative case study tests and evaluates the proposed control system. The microscopic simulation model reproduced stochastic and dynamic traffic elements to represent real traffic conditions. Rigorous tests showed that the proposed controller achieved the flow condition closest to optimal (0.80) for network operation and outperformed other controllers in reducing waiting-time costs (10%–36%), improving travel-time experiences (5%–25%), and achieving the highest mean travel speed (3.4 m/s).
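
The abstract identifies the controller as an off-policy, model-free agent based on the Q-learning algorithm. As a rough orientation only, the sketch below illustrates the standard tabular Q-learning update applied to a toy single-junction phase choice; the state encoding, action set, reward, and hyperparameters are illustrative assumptions and do not reproduce the thesis's design, which pairs a deep network with a microscopic traffic simulator.

```python
import random
from collections import defaultdict

# Illustrative hyperparameters (assumptions, not the thesis's values).
ALPHA = 0.1      # learning rate
GAMMA = 0.95     # discount factor
EPSILON = 0.1    # exploration rate of the epsilon-greedy behaviour policy

ACTIONS = ["NS_GREEN", "EW_GREEN"]   # toy phase choices at one junction
Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def choose_action(state):
    """Epsilon-greedy behaviour policy (off-policy w.r.t. the greedy target)."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(Q[state], key=Q[state].get)

def q_update(state, action, reward, next_state):
    """One Q-learning step: the target uses the max over next actions,
    regardless of the action the behaviour policy will actually take."""
    best_next = max(Q[next_state].values())
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])

def toy_env_step(state, action):
    """Stand-in environment: serving a phase drains its queue while the
    other queue grows. A real study would take state and reward from a
    microscopic traffic simulator (e.g. queue lengths, waiting times)."""
    ns, ew = state
    if action == "NS_GREEN":
        ns, ew = max(0, ns - 3), ew + random.randint(0, 2)
    else:
        ew, ns = max(0, ew - 3), ns + random.randint(0, 2)
    reward = -(ns + ew)              # penalise total queued vehicles
    return (ns, ew), reward

state = (5, 5)                        # (north-south queue, east-west queue)
for _ in range(1000):
    action = choose_action(state)
    next_state, reward = toy_env_step(state, action)
    q_update(state, action, reward, next_state)
    state = next_state
```

The off-policy property visible here is that `q_update` bootstraps from the greedy value `max(Q[next_state])` even though actions are drawn epsilon-greedily, which is what allows learning from exploratory experience.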