Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning
Activation functions are essential for deep learning methods to learn and perform complex tasks such as image classification. The Rectified Linear Unit (ReLU) has been widely used and has become the default activation function across the deep learning community since 2012. Although ReLU is popular, its hard-zero property heavily hinders negative values from propagating through the network. Consequently, the deep neural network does not benefit from negative representations. In this work, an activation function called Flatten-T Swish (FTS), which leverages the benefit of negative values, is proposed. To verify its performance, this study evaluates FTS against ReLU and several recent activation functions. Each activation function is trained on the MNIST dataset using five different deep fully connected neural networks (DFNNs) with depths varying from five to eight layers. For a fair evaluation, all DFNNs use the same configuration settings. Based on the experimental results, FTS with a threshold value of T = -0.20 has the best overall performance. Compared with ReLU, FTS (T = -0.20) improves MNIST classification accuracy by 0.13%, 0.70%, 0.67%, 1.07% and 1.15% on the wider 5-layer, slimmer 5-layer, 6-layer, 7-layer and 8-layer DFNNs, respectively. The study also observes that FTS converges twice as fast as ReLU. Although other existing activation functions are also evaluated, this study uses ReLU as the baseline activation function.
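The abstract does not state the functional form of FTS. As an illustrative sketch only, the snippet below assumes a Swish-like positive branch shifted by the threshold, i.e. FTS(x) = x · sigmoid(x) + T for x ≥ 0 and FTS(x) = T otherwise, with the reported best threshold T = -0.20; this form and the helper name `fts` are assumptions, not taken from this record.

```python
import numpy as np

def fts(x, T=-0.20):
    """Sketch of Flatten-T Swish under the assumed form:
    FTS(x) = x * sigmoid(x) + T for x >= 0, and T otherwise.
    The threshold T shifts the flat negative region below zero,
    so negative inputs still carry a small constant signal."""
    swish_shifted = x * (1.0 / (1.0 + np.exp(-x))) + T  # Swish-like branch, shifted by T
    return np.where(x >= 0, swish_shifted, T)            # flat thresholded branch for x < 0

# Quick comparison with ReLU on a few sample inputs
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(fts(x))              # negative inputs map to T = -0.20 instead of 0
print(np.maximum(x, 0.0))  # ReLU for reference
```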
Main Authors: Hock, Hung Chieng; Wahid, Noorhaniza; Ong, Pauline; Perla, Sai Raj Kishore
Format: Article
Language: English
Published: Program Studi Teknik Informatika, 2018
Published in: International Journal of Advances in Intelligent Informatics, 4 (2), pp. 76-86. ISSN 2442-6571. DOI: 10.26555/ijain.v4i2.249
Subjects: T Technology (General); TA1501-1820 Applied optics. Photonics
Online Access: http://eprints.uthm.edu.my/5227/1/AJ%202020%20%28102%29.pdf
http://eprints.uthm.edu.my/5227/
http://dx.doi.org/10.26555/ijain.v4i2.249