Regularization of deep neural network with batch contrastive loss
Neural networks have become deeper in recent years, and this has improved their capacity to handle more complex tasks. However, deep neural networks have more parameters and overfit more easily, especially when training samples are insufficient. In this paper, we present a new regularization technique c...
Main Authors: | Tanveer, Muhammad; Tan, Hung-Khoon; Ng, Hui-Fuang; Leung, Maylor Karhang; Chuah, Joon Huang |
---|---|
Format: | Article |
Published: | Institute of Electrical and Electronics Engineers, 2021 |
Subjects: | QA75 Electronic computers. Computer science; TA Engineering (General). Civil engineering (General) |
Online Access: | http://eprints.um.edu.my/28105/ |
id |
my.um.eprints.28105 |
---|---|
record_format |
eprints |
spelling |
my.um.eprints.28105 2022-07-25T04:06:35Z http://eprints.um.edu.my/28105/ Institute of Electrical and Electronics Engineers 2021 Article PeerReviewed Tanveer, Muhammad and Tan, Hung-Khoon and Ng, Hui-Fuang and Leung, Maylor Karhang and Chuah, Joon Huang (2021) Regularization of deep neural network with batch contrastive loss. IEEE Access, 9. pp. 124409-124418. ISSN 2169-3536, DOI https://doi.org/10.1109/ACCESS.2021.3110286 |
institution |
Universiti Malaya |
building |
UM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaya |
content_source |
UM Research Repository |
url_provider |
http://eprints.um.edu.my/ |
topic |
QA75 Electronic computers. Computer science; TA Engineering (General). Civil engineering (General) |
description |
Neural networks have become deeper in recent years, and this has improved their capacity to handle more complex tasks. However, deep neural networks have more parameters and overfit more easily, especially when training samples are insufficient. In this paper, we present a new regularization technique called batch contrastive regularization to improve generalization performance. The loss function is based on contrastive loss, which enforces intra-class compactness and inter-class separability of batch samples. We explore three different contrastive losses: (1) the center contrastive loss, which regularizes based on distances between data points and their corresponding class centroid, (2) the sample contrastive loss, which is based on batch sample-pair distances, and (3) the multicenter loss, which is similar to the center contrastive loss except that the cluster centers are discovered from training. The proposed network has two heads, one for classification and the other for regularization. The regularization head is discarded during inference. We also introduce bag sampling to ensure that all classes in a batch are well represented. The performance of the proposed architecture is evaluated on the CIFAR-10 and CIFAR-100 datasets. Our experiments show that networks regularized by batch contrastive loss display impressive generalization performance over a wide variety of classes, yielding more than 11% improvement for ResNet50 on CIFAR-100 when trained from scratch. |
format |
Article |
author |
Tanveer, Muhammad; Tan, Hung-Khoon; Ng, Hui-Fuang; Leung, Maylor Karhang; Chuah, Joon Huang |
author_sort |
Tanveer, Muhammad |
title |
Regularization of deep neural network with batch contrastive loss |
publisher |
Institute of Electrical and Electronics Engineers |
publishDate |
2021 |
url |
http://eprints.um.edu.my/28105/ |
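The description field names a center contrastive loss (distances between samples and class centroids) and a sample contrastive loss (distances between sample pairs in a batch). This record does not give the paper's exact formulas, so the following is only a minimal NumPy sketch under assumed definitions: a squared-distance pull term for matching classes and a squared hinge with a hypothetical `margin` parameter for non-matching ones. The function names and the margin value are illustrative, not taken from the article.

```python
import numpy as np


def center_contrastive_loss(features, labels, margin=1.0):
    """Assumed form: pull each sample toward its own batch class centroid and
    push it at least `margin` away from the centroids of other classes."""
    classes = np.unique(labels)
    # One centroid per class present in the batch, computed from batch samples.
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # dists[i, k] = Euclidean distance from sample i to centroid k.
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    own = labels[:, None] == classes[None, :]
    pull = (dists[own] ** 2).mean()
    other = dists[~own]
    push = (np.maximum(0.0, margin - other) ** 2).mean() if other.size else 0.0
    return float(pull + push)


def sample_contrastive_loss(features, labels, margin=1.0):
    """Assumed form: the same pull/push idea applied to every unordered pair
    of samples in the batch instead of to class centroids."""
    n = len(labels)
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    same = labels[:, None] == labels[None, :]
    upper = np.triu_indices(n, k=1)  # each unordered pair once, no self-pairs
    d, s = dists[upper], same[upper]
    pull = (d[s] ** 2).mean() if s.any() else 0.0
    push = (np.maximum(0.0, margin - d[~s]) ** 2).mean() if (~s).any() else 0.0
    return float(pull + push)


if __name__ == "__main__":
    # Two tight, well-separated classes: both regularizers vanish.
    feats = np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 5.0], [5.0, 5.0]])
    labs = np.array([0, 0, 1, 1])
    print(center_contrastive_loss(feats, labs))
    print(sample_contrastive_loss(feats, labs))
```

In a two-head setup like the one the abstract describes, a term of this kind would be computed on the regularization head's output and added to the classification loss during training. How the two are weighted is not stated in this record.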