An efficient semi-sigmoidal non-linear activation function approach for deep neural networks

A non-linear activation function is one of the key contributing factors to the success of Deep Learning (DL). Since the revival of DL in 2012, the Rectified Linear Unit (ReLU) has been regarded by the community as the de facto standard for many DL models. Despite its popularity, however, Re...

Full description

Saved in:
Bibliographic Details
Main Author: Chieng, Hock Hung
Format: Thesis
Language: English
Published: 2022
Subjects: QA76 Computer software
Online Access:http://eprints.uthm.edu.my/8409/1/24p%20CHIENG%20HOCK%20HUNG.pdf
http://eprints.uthm.edu.my/8409/2/CHIENG%20HOCK%20HUNG%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/8409/3/CHIENG%20HOCK%20HUNG%20WATERMARK.pdf
http://eprints.uthm.edu.my/8409/
Citation: Chieng, Hock Hung (2022) An efficient semi-sigmoidal non-linear activation function approach for deep neural networks. Doctoral thesis, Universiti Tun Hussein Onn Malaysia.
institution Universiti Tun Hussein Onn Malaysia
building UTHM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Tun Hussein Onn Malaysia
content_source UTHM Institutional Repository
url_provider http://eprints.uthm.edu.my/
language English
topic QA76 Computer software
description A non-linear activation function is one of the key contributing factors to the success of Deep Learning (DL). Since the revival of DL in 2012, the Rectified Linear Unit (ReLU) has been regarded by the community as the de facto standard for many DL models. Despite its popularity, however, ReLU has several shortcomings that can lead to inefficient learning in DL models: 1) its inherent negative-cancellation property removes all negative inputs, causing a massive loss of information in the network; 2) its derivative can give rise to the dead-neuron problem; 3) the mean activation it generates is highly positive, leading to a bias-shift effect in the network layers; 4) its inherent multilinear structure restricts the non-linear capability of the network; and 5) its predefined nature limits the flexibility of the network. To address these shortcomings, this study proposes a new family of activation functions based on the Semi-sigmoidal (Sig) approach. Three variants are introduced: the Shifted Semi-sigmoidal (SSig), the Adaptive Shifted Semi-sigmoidal (ASSig), and the Bi-directional Adaptive Shifted Semi-sigmoidal (BiASSig). The proposed activation functions were tested against ReLU (the baseline) and state-of-the-art methods using eight Deep Neural Networks (DNNs) on seven benchmark image datasets, with Adaptive Moment Estimation (ADAM) and Stochastic Gradient Descent (SGD) as the training optimizers. A baseline comparison score and mean rank were used to consolidate and analyse the experimental results. In terms of the overall baseline comparison score, SSig, ASSig, and BiASSig scored 79, 87, and 86 out of 112, respectively, outperforming ReLU in more than 70% of the cases. In terms of overall mean rank (OMR), ReLU ranked tenth (10th), whereas SSig, ASSig, and BiASSig ranked fifth (5th), first (1st), and second (2nd), respectively, outperforming ReLU and the other compared methods.
format Thesis
author Chieng, Hock Hung
title An efficient semi-sigmoidal non-linear activation function approach for deep neural networks
publishDate 2022
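
The abstract above ties three of ReLU's shortcomings to concrete numerical behaviour: negative inputs are cancelled outright, the gradient is zero over the negative region (dead neurons), and the mean activation is strongly positive (bias shift). The short NumPy sketch below illustrates those properties and contrasts them with a generic shifted-sigmoid-style non-linearity. The thesis's actual SSig, ASSig, and BiASSig definitions are not reproduced in this record, so shifted_semi_sigmoidal, its shift and scale parameters, and its exact form are illustrative assumptions only, not the author's formulas.

import numpy as np

def relu(x):
    # ReLU cancels every negative input (shortcoming 1) and its gradient is
    # zero there, so a unit stuck in the negative region stops learning
    # (the dead-neuron problem, shortcoming 2).
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(x.dtype)

def shifted_semi_sigmoidal(x, shift=1.0, scale=2.0):
    # Hypothetical illustration only: a sigmoid that is shifted and rescaled so
    # that negative inputs keep a small, smooth, non-zero response instead of
    # being discarded. The name and parameters are assumptions, not the SSig
    # formula from the thesis.
    return scale / (1.0 + np.exp(-(x + shift))) - scale / 2.0

x = np.linspace(-5.0, 5.0, 101)
print("mean ReLU activation:", relu(x).mean())                                # strongly positive (shortcoming 3)
print("mean shifted-sigmoid activation:", shifted_semi_sigmoidal(x).mean())   # much closer to zero
print("fraction of zero ReLU gradients:", (relu_grad(x) == 0).mean())         # dead region for x <= 0

Keeping a small, smooth response for negative inputs and a mean activation nearer zero is the general intuition such a sketch can convey; how the adaptive variants (ASSig, BiASSig) parameterise and train their shifts is described in the thesis itself, not in this record.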