Staff View: Comparison of imbalanced data treatments: a case study on cleft lip and palate data

Comparison of imbalanced data treatments: a case study on cleft lip and palate data

This study investigated whether resampling and penalized approaches to balancing data would improve the classification model produced by the random forests learning algorithm on a small and imbalanced Cleft Lip and Palate (CLP) patients’ dataset. Comparison betwe...

Bibliographic Details
Main Authors:	Zaturrawiah Ali Omar; Chin, Su Na; Siti Rahayu Mohd. Hashim; Norhafiza Hamzah
Format:	Proceedings
Language:	English
Published:	Faculty of Science and Natural Resources 2020
Subjects:	QA Mathematics; SD Forestry
Online Access:	https://eprints.ums.edu.my/id/eprint/21431/1/Comparison%20of%20imbalanced%20data%20treatments.pdf https://eprints.ums.edu.my/id/eprint/21431/2/Comparison%20of%20imbalanced%20data%20treatments1.pdf https://eprints.ums.edu.my/id/eprint/21431/ https://www.ums.edu.my/fssa/wp-content/uploads/2020/12/PROCEEDINGS-BOOK-ST-2020-e-ISSN.pdf

id	my.ums.eprints.21431
record_format	eprints
spelling	my.ums.eprints.21431 2021-06-17T02:32:21Z https://eprints.ums.edu.my/id/eprint/21431/ Comparison of imbalanced data treatments: a case study on cleft lip and palate data Zaturrawiah Ali Omar Chin, Su Na Siti Rahayu Mohd. Hashim Norhafiza Hamzah QA Mathematics SD Forestry This study investigated whether resampling and penalized approaches to balancing data would improve the classification model produced by the random forests learning algorithm on a small and imbalanced Cleft Lip and Palate (CLP) patients’ dataset. Balanced Random Forest (BRF), Synthetic Minority Over-sampling Technique (SMOTE) with Random Forests (SMOTE+RF), and Weighted Random Forest (WRF) were applied to the CLP dataset, and the results were compared using the area under the curve (AUC) and the trade-off between sensitivity and specificity. The results showed no difference in predictive ability among the untreated model (RF), oversampling (SMOTE+RF), and the penalty treatment (WRF), but poor performance from the downsampling treatment (BRF). The small training and test sample sizes were observed to have contributed to these results and to have severely affected the performance of the classifier used for each treatment. The SMOTE+RF oversampling method nevertheless showed promise for the CLP dataset. Faculty of Science and Natural Resources 2020 Proceedings PeerReviewed text en https://eprints.ums.edu.my/id/eprint/21431/1/Comparison%20of%20imbalanced%20data%20treatments.pdf text en https://eprints.ums.edu.my/id/eprint/21431/2/Comparison%20of%20imbalanced%20data%20treatments1.pdf Zaturrawiah Ali Omar and Chin, Su Na and Siti Rahayu Mohd. Hashim and Norhafiza Hamzah (2020) Comparison of imbalanced data treatments: a case study on cleft lip and palate data. https://www.ums.edu.my/fssa/wp-content/uploads/2020/12/PROCEEDINGS-BOOK-ST-2020-e-ISSN.pdf
institution	Universiti Malaysia Sabah
building	UMS Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Sabah
content_source	UMS Institutional Repository
url_provider	http://eprints.ums.edu.my/
language	English English
topic	QA Mathematics SD Forestry
spellingShingle	QA Mathematics SD Forestry Zaturrawiah Ali Omar Chin, Su Na Siti Rahayu Mohd. Hashim Norhafiza Hamzah Comparison of imbalanced data treatments: a case study on cleft lip and palate data
description	This study investigated whether resampling and penalized approaches to balancing data would improve the classification model produced by the random forests learning algorithm on a small and imbalanced Cleft Lip and Palate (CLP) patients’ dataset. Balanced Random Forest (BRF), Synthetic Minority Over-sampling Technique (SMOTE) with Random Forests (SMOTE+RF), and Weighted Random Forest (WRF) were applied to the CLP dataset, and the results were compared using the area under the curve (AUC) and the trade-off between sensitivity and specificity. The results showed no difference in predictive ability among the untreated model (RF), oversampling (SMOTE+RF), and the penalty treatment (WRF), but poor performance from the downsampling treatment (BRF). The small training and test sample sizes were observed to have contributed to these results and to have severely affected the performance of the classifier used for each treatment. The SMOTE+RF oversampling method nevertheless showed promise for the CLP dataset.
format	Proceedings
author	Zaturrawiah Ali Omar Chin, Su Na Siti Rahayu Mohd. Hashim Norhafiza Hamzah
author_facet	Zaturrawiah Ali Omar Chin, Su Na Siti Rahayu Mohd. Hashim Norhafiza Hamzah
author_sort	Zaturrawiah Ali Omar
title	Comparison of imbalanced data treatments: a case study on cleft lip and palate data
title_short	Comparison of imbalanced data treatments: a case study on cleft lip and palate data
title_full	Comparison of imbalanced data treatments: a case study on cleft lip and palate data
title_fullStr	Comparison of imbalanced data treatments: a case study on cleft lip and palate data
title_full_unstemmed	Comparison of imbalanced data treatments: a case study on cleft lip and palate data
title_sort	comparison of imbalanced data treatments: a case study on cleft lip and palate data
publisher	Faculty of Science and Natural Resources
publishDate	2020
url	https://eprints.ums.edu.my/id/eprint/21431/1/Comparison%20of%20imbalanced%20data%20treatments.pdf https://eprints.ums.edu.my/id/eprint/21431/2/Comparison%20of%20imbalanced%20data%20treatments1.pdf https://eprints.ums.edu.my/id/eprint/21431/ https://www.ums.edu.my/fssa/wp-content/uploads/2020/12/PROCEEDINGS-BOOK-ST-2020-e-ISSN.pdf
_version_	1760229842184306688
score	13.189132
