Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment

Reinforcement learning is used to solve a wide range of tasks. A complex environment is a recent class of problem for reinforcement learning, in which an agent interacts with its surroundings and learns to solve the task at hand. Solving a complex environment efficiently with a reinforcement learning agent requires many parameters to be taken into account. Every action the agent takes has a consequence in the form of a reward, and based on this reward the agent develops a policy for solving the environment, generally one that maximizes the cumulative reward. The quality of the learned policy depends on the exploration strategy the agent employs, so reinforcement learning architectures rely on both the policy and the exploration strategy to solve the environment efficiently. This research consists of two parts. First, the deep reinforcement learning architecture "Dueling Deep Q-Network" (Dueling DQN) is optimized by improving its exploration strategy: a recent exploration technique, curiosity-driven intrinsic motivation, is combined with the Dueling DQN, and the resulting Curious Dueling DQN is compared against the existing Dueling DQN. Second, the Curious Dueling DQN is validated against the Noisy Dueling DQN, a combination of the Dueling DQN with another recent exploration strategy called Noisy Nets, to identify the better exploration strategy. Both solutions are evaluated in the Super Mario Bros environment using mean score and estimation loss. The proposed model improves the mean score threefold, while the estimation loss increases by 28%.
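The dueling architecture named in the title splits the Q-function into a state-value stream and an advantage stream. As a rough illustration only, here is a minimal PyTorch-style sketch of that decomposition; the class name, layer sizes, and the use of a flat vector state (Super Mario Bros agents typically use convolutional image inputs) are illustrative assumptions, not details taken from the thesis:

    import torch
    import torch.nn as nn

    class DuelingDQN(nn.Module):
        # Dueling network: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
        def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
            super().__init__()
            self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
            self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
            self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s, a)

        def forward(self, state: torch.Tensor) -> torch.Tensor:
            h = self.feature(state)
            v = self.value(h)
            a = self.advantage(h)
            # Subtracting the mean advantage keeps the V/A split identifiable.
            return v + a - a.mean(dim=1, keepdim=True)

The mean-subtraction term follows the standard dueling formulation: without it, a constant could be shifted freely between the value and advantage streams.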

Bibliographic Details
Main Author: Khan, Muhammad Shehryar
Format: Thesis
Language: English
Published: 2019
Subjects: QA75 Electronic computers. Computer science
Online Access:http://eprints.utm.my/id/eprint/96670/1/MuhammadShehryarKhanMSC2019.pdf.pdf
http://eprints.utm.my/id/eprint/96670/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:143073
id my.utm.96670
record_format eprints
spelling my.utm.96670 2022-08-15T08:30:08Z http://eprints.utm.my/id/eprint/96670/ Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment Khan, Muhammad Shehryar QA75 Electronic computers. Computer science Reinforcement learning is used to solve a wide range of tasks. A complex environment is a recent class of problem for reinforcement learning, in which an agent interacts with its surroundings and learns to solve the task at hand. Solving a complex environment efficiently with a reinforcement learning agent requires many parameters to be taken into account. Every action the agent takes has a consequence in the form of a reward, and based on this reward the agent develops a policy for solving the environment, generally one that maximizes the cumulative reward. The quality of the learned policy depends on the exploration strategy the agent employs, so reinforcement learning architectures rely on both the policy and the exploration strategy to solve the environment efficiently. This research consists of two parts. First, the deep reinforcement learning architecture "Dueling Deep Q-Network" (Dueling DQN) is optimized by improving its exploration strategy: a recent exploration technique, curiosity-driven intrinsic motivation, is combined with the Dueling DQN, and the resulting Curious Dueling DQN is compared against the existing Dueling DQN. Second, the Curious Dueling DQN is validated against the Noisy Dueling DQN, a combination of the Dueling DQN with another recent exploration strategy called Noisy Nets, to identify the better exploration strategy. Both solutions are evaluated in the Super Mario Bros environment using mean score and estimation loss. The proposed model improves the mean score threefold, while the estimation loss increases by 28%. 2019 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/96670/1/MuhammadShehryarKhanMSC2019.pdf.pdf Khan, Muhammad Shehryar (2019) Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment. Masters thesis, Universiti Teknologi Malaysia. http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:143073
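Curiosity-driven intrinsic motivation, the first exploration strategy the thesis adds to the Dueling DQN, rewards the agent for reaching states its internal model cannot yet predict. A minimal sketch of one common formulation (an ICM-style forward model whose prediction error serves as the exploration bonus) is given below; the module names, dimensions, and the eta scaling factor are illustrative assumptions, not details from the thesis:

    import torch
    import torch.nn as nn

    class ForwardModel(nn.Module):
        # Predicts the next state embedding from the current embedding and action.
        def __init__(self, embed_dim: int, n_actions: int, hidden: int = 128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(embed_dim + n_actions, hidden),
                nn.ReLU(),
                nn.Linear(hidden, embed_dim),
            )

        def forward(self, phi_s, action_onehot):
            return self.net(torch.cat([phi_s, action_onehot], dim=1))

    def intrinsic_reward(model, phi_s, phi_next, action_onehot, eta=0.01):
        # Curiosity bonus: scaled squared prediction error of the forward model.
        with torch.no_grad():
            pred = model(phi_s, action_onehot)
        return eta * 0.5 * (pred - phi_next).pow(2).sum(dim=1)

The bonus is added to the environment reward before the Q-learning update, so transitions the model predicts poorly earn a larger total reward and are revisited.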
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Khan, Muhammad Shehryar
Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment
description Reinforcement learning is used to solve a wide range of tasks. A complex environment is a recent class of problem for reinforcement learning, in which an agent interacts with its surroundings and learns to solve the task at hand. Solving a complex environment efficiently with a reinforcement learning agent requires many parameters to be taken into account. Every action the agent takes has a consequence in the form of a reward, and based on this reward the agent develops a policy for solving the environment, generally one that maximizes the cumulative reward. The quality of the learned policy depends on the exploration strategy the agent employs, so reinforcement learning architectures rely on both the policy and the exploration strategy to solve the environment efficiently. This research consists of two parts. First, the deep reinforcement learning architecture "Dueling Deep Q-Network" (Dueling DQN) is optimized by improving its exploration strategy: a recent exploration technique, curiosity-driven intrinsic motivation, is combined with the Dueling DQN, and the resulting Curious Dueling DQN is compared against the existing Dueling DQN. Second, the Curious Dueling DQN is validated against the Noisy Dueling DQN, a combination of the Dueling DQN with another recent exploration strategy called Noisy Nets, to identify the better exploration strategy. Both solutions are evaluated in the Super Mario Bros environment using mean score and estimation loss. The proposed model improves the mean score threefold, while the estimation loss increases by 28%.
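Noisy Nets, the second exploration strategy compared in the thesis, replaces epsilon-greedy action noise with learned parameter noise inside the network's linear layers. The following is a minimal sketch of a factorized-Gaussian noisy linear layer in PyTorch; the initialization constants and names follow common practice for this technique and are assumptions, not taken from the thesis:

    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NoisyLinear(nn.Module):
        # Linear layer whose weights and biases are perturbed by learned,
        # factorized Gaussian noise; exploration comes from the noise itself.
        def __init__(self, in_features: int, out_features: int, sigma0: float = 0.5):
            super().__init__()
            self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
            self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
            self.bias_mu = nn.Parameter(torch.empty(out_features))
            self.bias_sigma = nn.Parameter(torch.empty(out_features))
            self.register_buffer("eps_in", torch.zeros(in_features))
            self.register_buffer("eps_out", torch.zeros(out_features))
            bound = 1.0 / math.sqrt(in_features)
            nn.init.uniform_(self.weight_mu, -bound, bound)
            nn.init.uniform_(self.bias_mu, -bound, bound)
            nn.init.constant_(self.weight_sigma, sigma0 / math.sqrt(in_features))
            nn.init.constant_(self.bias_sigma, sigma0 / math.sqrt(in_features))

        @staticmethod
        def _scale(x):
            # f(x) = sign(x) * sqrt(|x|), the factorized-noise transform.
            return x.sign() * x.abs().sqrt()

        def forward(self, x):
            self.eps_in.normal_()
            self.eps_out.normal_()
            w_eps = torch.outer(self._scale(self.eps_out), self._scale(self.eps_in))
            weight = self.weight_mu + self.weight_sigma * w_eps
            bias = self.bias_mu + self.bias_sigma * self._scale(self.eps_out)
            return F.linear(x, weight, bias)

Substituting such layers for the ordinary linear layers in a Dueling DQN would yield the Noisy Dueling DQN baseline described in the abstract; since the sigma parameters are learned, the network can anneal its own exploration over training.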
format Thesis
author Khan, Muhammad Shehryar
author_facet Khan, Muhammad Shehryar
author_sort Khan, Muhammad Shehryar
title Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment
title_short Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment
title_full Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment
title_fullStr Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment
title_full_unstemmed Optimizing exploration parameter in dueling deep Q-networks for complex gaming environment
title_sort optimizing exploration parameter in dueling deep q-networks for complex gaming environment
publishDate 2019
url http://eprints.utm.my/id/eprint/96670/1/MuhammadShehryarKhanMSC2019.pdf.pdf
http://eprints.utm.my/id/eprint/96670/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:143073
_version_ 1743107011659169792