Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems

This paper focuses on the enhancement of the generalization ability and training stability of deep neural networks (DNNs). New activation functions that we call bounded rectified linear unit (ReLU), bounded leaky ReLU, and bounded bi-firing are proposed. These activation functions are defined based...

Full description

Bibliographic Details
Main Authors: Liew, S. S., Khalil-Hani, M., Bakhteri, R.
Format: Article
Published: Elsevier B.V. 2016
Subjects:
Online Access:http://eprints.utm.my/id/eprint/71526/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84994477344&doi=10.1016%2fj.neucom.2016.08.037&partnerID=40&md5=5b940413f14332dd63cda37f4ebfbe4b
id my.utm.71526
record_format eprints
spelling my.utm.715262017-11-14T07:00:33Z http://eprints.utm.my/id/eprint/71526/ Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems Liew, S. S. Khalil-Hani, M. Bakhteri, R. TK Electrical engineering. Electronics. Nuclear engineering This paper focuses on enhancing the generalization ability and training stability of deep neural networks (DNNs). New activation functions, which we call the bounded rectified linear unit (ReLU), bounded leaky ReLU, and bounded bi-firing functions, are proposed. These activation functions are defined based on the desired properties of the universal approximation theorem (UAT). A new set of coefficient values for the scaled hyperbolic tangent function is also presented. These contributions result in improved classification performance and training stability in DNNs. Experiments using multilayer perceptron (MLP) and convolutional neural network (CNN) models show that the proposed activation functions outperform their respective original forms in terms of classification accuracy and numerical stability. Tests on the MNIST, mnist-rot-bg-img handwritten digit, and AR Purdue face databases show that significant improvements of 17.31%, 9.19%, and 74.99% in testing misclassification error rates (MCRs) can be achieved with both the mean squared error (MSE) and cross-entropy (CE) loss functions, without sacrificing computational efficiency. On the MNIST dataset, bounding the output of an activation function reduces numerical instability by 78.58%, and on the mnist-rot-bg-img and AR Purdue databases the problem is eliminated completely. This work thus demonstrates the significance of bounding an activation function in alleviating the training instability problem when training a DNN model (particularly a CNN). Elsevier B.V. 2016 Article PeerReviewed Liew, S. S. and Khalil-Hani, M. and Bakhteri, R. (2016) Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems. Neurocomputing, 216. pp. 718-734. ISSN 0925-2312 https://www.scopus.com/inward/record.uri?eid=2-s2.0-84994477344&doi=10.1016%2fj.neucom.2016.08.037&partnerID=40&md5=5b940413f14332dd63cda37f4ebfbe4b
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic TK Electrical engineering. Electronics. Nuclear engineering
spellingShingle TK Electrical engineering. Electronics. Nuclear engineering
Liew, S. S.
Khalil-Hani, M.
Bakhteri, R.
Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems
description This paper focuses on enhancing the generalization ability and training stability of deep neural networks (DNNs). New activation functions, which we call the bounded rectified linear unit (ReLU), bounded leaky ReLU, and bounded bi-firing functions, are proposed. These activation functions are defined based on the desired properties of the universal approximation theorem (UAT). A new set of coefficient values for the scaled hyperbolic tangent function is also presented. These contributions result in improved classification performance and training stability in DNNs. Experiments using multilayer perceptron (MLP) and convolutional neural network (CNN) models show that the proposed activation functions outperform their respective original forms in terms of classification accuracy and numerical stability. Tests on the MNIST, mnist-rot-bg-img handwritten digit, and AR Purdue face databases show that significant improvements of 17.31%, 9.19%, and 74.99% in testing misclassification error rates (MCRs) can be achieved with both the mean squared error (MSE) and cross-entropy (CE) loss functions, without sacrificing computational efficiency. On the MNIST dataset, bounding the output of an activation function reduces numerical instability by 78.58%, and on the mnist-rot-bg-img and AR Purdue databases the problem is eliminated completely. This work thus demonstrates the significance of bounding an activation function in alleviating the training instability problem when training a DNN model (particularly a CNN).
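Note: the abstract names the bounded ReLU, bounded leaky ReLU, and bounded bi-firing functions, but this record does not give their closed forms. The Python sketch below is a minimal illustration under the assumption that "bounding" means clipping the familiar unbounded activations at a hypothetical upper bound A; alpha and A are illustrative values only, and the paper's exact definitions (including the bounded bi-firing function and the new scaled hyperbolic tangent coefficients) must be taken from the article itself.

import numpy as np

# Hedged sketch only: assumes the bounded variants clip the output of the
# usual ReLU / leaky ReLU at an assumed upper bound A, keeping activations
# in a finite range (the boundedness property the universal approximation
# theorem favors, per the abstract). A and alpha are not from the paper.
def bounded_relu(x, A=1.0):
    return np.minimum(np.maximum(0.0, x), A)

def bounded_leaky_relu(x, alpha=0.01, A=1.0):
    return np.minimum(np.where(x >= 0.0, x, alpha * x), A)

x = np.linspace(-3.0, 3.0, 7)
print(bounded_relu(x))        # large positive inputs saturate at A instead of growing without bound
print(bounded_leaky_relu(x))  # negative inputs keep a small slope; positive outputs are capped at A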
format Article
author Liew, S. S.
Khalil-Hani, M.
Bakhteri, R.
author_facet Liew, S. S.
Khalil-Hani, M.
Bakhteri, R.
author_sort Liew, S. S.
title Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems
title_short Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems
title_full Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems
title_fullStr Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems
title_full_unstemmed Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems
title_sort bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems
publisher Elsevier B.V.
publishDate 2016
url http://eprints.utm.my/id/eprint/71526/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84994477344&doi=10.1016%2fj.neucom.2016.08.037&partnerID=40&md5=5b940413f14332dd63cda37f4ebfbe4b