A comparative study of policies in Q-learning for foraging tasks


Bibliographic Details
Main Authors: Mohan, Y., Ponnambalam, S.G., Inayat-Hussain, J.I.
Format:
Published: 2017
Online Access: http://dspace.uniten.edu.my/jspui/handle/123456789/6522
Description
Summary: Q-learning is a machine learning technique that learns how to map states to actions so as to maximize rewards. It has been applied to tasks such as foraging, soccer, and prey-pursuing robots. This paper uses a simple foraging task to study the influence of the action-selection policies reported in the open literature. A mobile robot searches for pucks and retrieves them to a home location. The goal of the study is to identify an efficient policy for Q-learning that maximizes the number of pucks collected and minimizes the number of collisions in the environment. Four policies, namely greedy, epsilon-greedy, Boltzmann distribution, and random search, are compared on the foraging task, and the results are presented. ©2009 IEEE.
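
The four policies named in the summary differ only in how an action is chosen from the Q-values of the current state. The following is a minimal Python sketch of that difference, not the authors' implementation: the record gives no parameter values, so the function names, the `epsilon` and `temperature` parameters, and the example Q-values are all assumptions made for illustration.

```python
import math
import random


def greedy(q_values):
    """Return the index of the highest Q-value (first index on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon explore uniformly, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values; higher temperature flattens the distribution."""
    highest = max(q_values)  # subtracted for numerical stability only
    weights = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann(q_values, temperature, rng=random):
    """Sample an action in proportion to its Boltzmann probability."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def random_search(q_values, rng=random):
    """Ignore the Q-values and pick any action uniformly at random."""
    return rng.randrange(len(q_values))


# Hypothetical Q-values for one state with three actions (assumed values).
example_q = [0.1, 0.5, 0.2]
example_probs = boltzmann_probabilities(example_q, temperature=0.5)
```

In this sketch, greedy always exploits the current estimates, random search never does, and epsilon-greedy and Boltzmann sit in between, with `epsilon` and `temperature` controlling how much exploration takes place.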