Staff View: Boosting and bagging classification for computer science journal

Boosting and bagging classification for computer science journal

In recent years, data processing has become an issue across all disciplines. Good data processing can provide decision-making recommendations. Data processing is covered in academic publications, including those in computer science. This topic has grown over the past three years,...

Bibliographic Details
Main Authors:	Wibawa, Aji Prasetya, Putri, Nastiti Susetyo Fanany, Al Rasyid, Harits, Nafalski, Andrew, Hashim, Ummi Rabaah
Format:	Article
Language:	English
Published:	Universitas Ahmad Dahlan 2023
Online Access:	http://eprints.utem.edu.my/id/eprint/27372/2/0167813062023209.PDF http://eprints.utem.edu.my/id/eprint/27372/ https://ijain.org/index.php/IJAIN/article/view/985 https://doi.org/10.26555/ijain.v9i1.985

id	my.utem.eprints.27372
record_format	eprints
spelling	my.utem.eprints.27372 2024-07-25T09:43:10Z http://eprints.utem.edu.my/id/eprint/27372/ Boosting and bagging classification for computer science journal Wibawa, Aji Prasetya Putri, Nastiti Susetyo Fanany Al Rasyid, Harits Nafalski, Andrew Hashim, Ummi Rabaah In recent years, data processing has become an issue across all disciplines. Good data processing can provide decision-making recommendations. Data processing is covered in academic publications, including those in computer science. This topic has grown over the past three years, demonstrating that data processing is expanding and diversifying and that there is a great deal of interest in this area of study. Journal groupings (quartiles), provided by SCImago, indicate a journal's influence on other similar studies. There are four quartiles, with the highest being 1 and the lowest being 4. However, quartile classes differ widely, and the same journal can have different quartile values in different disciplines. Therefore, a classification method is proposed to address this issue. Classification is a machine-learning technique that groups data based on the supplied class label. Ensemble Boosting and Bagging with Decision Tree (DT) and Gaussian Naive Bayes (GNB) were utilized in this study. The ensemble algorithms' depth and estimator settings were varied to examine how increasing these values affects the resulting precision. In the DT algorithm, both settings are varied, whereas in the GNB algorithm, only the estimator value is modified. Based on the average accuracy results, the best algorithm for computer science datasets is GNB Bagging, with values of 68.96%, 70.99%, and 69.05%. In second place, XGBDT has 67.75% accuracy, 67.69% precision, and 67.83% recall. The DT Bagging method placed third with 67.30% accuracy, 68.13% precision, and 67.31% recall. Fourth is the XGBoost GNB approach, with an accuracy of 67.07%, a precision of 68.85%, and a recall of 67.18%. The AdaBoost DT technique ranks fifth with an accuracy of 63.65%, a precision of 64.21%, and a recall of 63.63%. AdaBoost GNB is the least effective algorithm for this dataset, achieving only 43.19% accuracy, 48.14% precision, and 43.2% recall. The results are still quite far from ideal; hence, the proposed method is not advised for resolving journal quartile inequality issues. Universitas Ahmad Dahlan 2023-03 Article PeerReviewed text en http://eprints.utem.edu.my/id/eprint/27372/2/0167813062023209.PDF Wibawa, Aji Prasetya and Putri, Nastiti Susetyo Fanany and Al Rasyid, Harits and Nafalski, Andrew and Hashim, Ummi Rabaah (2023) Boosting and bagging classification for computer science journal. International Journal of Advances in Intelligent Informatics, 9 (1). pp. 27-38. ISSN 2442-6571 https://ijain.org/index.php/IJAIN/article/view/985 https://doi.org/10.26555/ijain.v9i1.985
institution	Universiti Teknikal Malaysia Melaka
building	UTEM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknikal Malaysia Melaka
content_source	UTEM Institutional Repository
url_provider	http://eprints.utem.edu.my/
language	English
description	In recent years, data processing has become an issue across all disciplines. Good data processing can provide decision-making recommendations. Data processing is covered in academic publications, including those in computer science. This topic has grown over the past three years, demonstrating that data processing is expanding and diversifying and that there is a great deal of interest in this area of study. Journal groupings (quartiles), provided by SCImago, indicate a journal's influence on other similar studies. There are four quartiles, with the highest being 1 and the lowest being 4. However, quartile classes differ widely, and the same journal can have different quartile values in different disciplines. Therefore, a classification method is proposed to address this issue. Classification is a machine-learning technique that groups data based on the supplied class label. Ensemble Boosting and Bagging with Decision Tree (DT) and Gaussian Naive Bayes (GNB) were utilized in this study. The ensemble algorithms' depth and estimator settings were varied to examine how increasing these values affects the resulting precision. In the DT algorithm, both settings are varied, whereas in the GNB algorithm, only the estimator value is modified. Based on the average accuracy results, the best algorithm for computer science datasets is GNB Bagging, with values of 68.96%, 70.99%, and 69.05%. In second place, XGBDT has 67.75% accuracy, 67.69% precision, and 67.83% recall. The DT Bagging method placed third with 67.30% accuracy, 68.13% precision, and 67.31% recall. Fourth is the XGBoost GNB approach, with an accuracy of 67.07%, a precision of 68.85%, and a recall of 67.18%. The AdaBoost DT technique ranks fifth with an accuracy of 63.65%, a precision of 64.21%, and a recall of 63.63%. AdaBoost GNB is the least effective algorithm for this dataset, achieving only 43.19% accuracy, 48.14% precision, and 43.2% recall. The results are still quite far from ideal; hence, the proposed method is not advised for resolving journal quartile inequality issues.
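The bagging-with-GNB setup mentioned in the description, where only the number of estimators is varied, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the class and function names, the made-up two-feature toy rows, the variance-smoothing constant, and the majority-vote rule are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter, defaultdict


class GaussianNB:
    """Tiny Gaussian Naive Bayes: one mean and variance per class and feature."""

    def fit(self, X, y):
        rows_by_class = defaultdict(list)
        for row, label in zip(X, y):
            rows_by_class[label].append(row)
        n = len(X)
        self.stats = {}
        for label, rows in rows_by_class.items():
            means = [sum(col) / len(rows) for col in zip(*rows)]
            # Assumed smoothing constant: keeps every variance strictly positive.
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                for col, m in zip(zip(*rows), means)
            ]
            self.stats[label] = (math.log(len(rows) / n), means, variances)
        return self

    def predict_one(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(row, means, variances)
            )

        return max(self.stats, key=log_posterior)


def bagging_fit(X, y, n_estimators=10, seed=0):
    """Train n_estimators GNB models, each on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(GaussianNB().fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Combine the base learners by majority vote."""
    votes = Counter(model.predict_one(row) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two made-up numeric features per journal, two quartile labels.
X = [(8.0 + 0.2 * i, 0.80 + 0.02 * i) for i in range(8)] + \
    [(1.0 + 0.2 * i, 0.10 + 0.02 * i) for i in range(8)]
y = ["Q1"] * 8 + ["Q4"] * 8

models = bagging_fit(X, y, n_estimators=15, seed=0)
print(bagging_predict(models, (9.0, 0.9)))
```

Changing `n_estimators` is the only tuning knob here, which mirrors the description's statement that only the estimator value is modified for GNB; a decision-tree base learner would add a depth setting as well.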
format	Article
author	Wibawa, Aji Prasetya Putri, Nastiti Susetyo Fanany Al Rasyid, Harits Nafalski, Andrew Hashim, Ummi Rabaah
spellingShingle	Wibawa, Aji Prasetya Putri, Nastiti Susetyo Fanany Al Rasyid, Harits Nafalski, Andrew Hashim, Ummi Rabaah Boosting and bagging classification for computer science journal
author_facet	Wibawa, Aji Prasetya Putri, Nastiti Susetyo Fanany Al Rasyid, Harits Nafalski, Andrew Hashim, Ummi Rabaah
author_sort	Wibawa, Aji Prasetya
title	Boosting and bagging classification for computer science journal
title_short	Boosting and bagging classification for computer science journal
title_full	Boosting and bagging classification for computer science journal
title_fullStr	Boosting and bagging classification for computer science journal
title_full_unstemmed	Boosting and bagging classification for computer science journal
title_sort	boosting and bagging classification for computer science journal
publisher	Universitas Ahmad Dahlan
publishDate	2023
url	http://eprints.utem.edu.my/id/eprint/27372/2/0167813062023209.PDF http://eprints.utem.edu.my/id/eprint/27372/ https://ijain.org/index.php/IJAIN/article/view/985 https://doi.org/10.26555/ijain.v9i1.985
_version_	1806429018809958400
score	13.211869
