Rotation Forest Ensemble Classifier to Improve the Cardiovascular Disease Risk Prediction Accuracy
Main Authors: | , , , , , |
---|---|
Format: | Conference or Workshop Item |
Published: | Institute of Electrical and Electronics Engineers Inc., 2021 |
Online Access: | http://scholars.utp.edu.my/id/eprint/33452/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85126971116&doi=10.1109%2fNICS54270.2021.9701455&partnerID=40&md5=37e4cce6662becf0ffaf91d5fcea820c |
Summary: | Heart disease is one of the primary causes of sudden death worldwide, so predicting heart disease risk is very important. Early-stage prediction can save lives by prompting patients to undergo appropriate diagnostic steps or make necessary lifestyle changes. Recent studies have focused on the use of data mining and machine learning to detect diseases based on specific features of a person. The Rotation Forest, a tree-based ensemble classifier that uses Principal Component Analysis for feature extraction, is proposed to improve the prediction accuracy of heart disease risk. The Statlog heart dataset from the publicly available UCI Machine Learning Repository was used in this work. A Rotation Forest ensemble classifier was trained on the dataset, first with the default base classifier J48 and then with Random Forest as the base classifier, on both the full feature set and the features selected by the One Rule and Support Vector Machines attribute evaluators. The performance of the Rotation Forest was compared with that of standard machine learning classifiers: Naïve Bayes, Logistic Regression, Support Vector Machines, K-Nearest Neighbors, AdaBoostM1, and Bagging. The Rotation Forest algorithm with Random Forest achieved the highest accuracy, 94.44%, and an area under the ROC curve of 0.980 on the features selected from the Statlog dataset by the One Rule method. © 2021 IEEE. |
---|
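
The summary describes Rotation Forest as a tree-based ensemble that uses Principal Component Analysis for feature extraction. The sketch below illustrates only that rotation step for a single ensemble member, assuming NumPy is available. The function name `build_rotation_matrix`, its parameters, and the random stand-in data are hypothetical and are not taken from the paper, which reports using J48 and Random Forest as base classifiers. The sketch also omits the random class-subset removal that the original Rotation Forest algorithm applies before bootstrapping.

```python
import numpy as np


def build_rotation_matrix(X, n_subsets=3, sample_frac=0.75, rng=None):
    """Build one Rotation Forest style rotation matrix (illustrative sketch).

    Features are randomly partitioned into `n_subsets` groups. PCA is run on a
    bootstrap sample of each group, and the resulting principal axes are placed
    into a square matrix at the positions of that group's original columns.
    """
    rng = np.random.default_rng(rng)
    n_samples, n_features = X.shape

    # Random partition of the feature indices into disjoint subsets.
    subsets = np.array_split(rng.permutation(n_features), n_subsets)

    R = np.zeros((n_features, n_features))
    for cols in subsets:
        if cols.size == 0:
            continue  # more subsets than features; nothing to rotate here

        # Bootstrap sample of rows, restricted to this feature subset.
        n_rows = max(cols.size + 1, int(sample_frac * n_samples))
        rows = rng.choice(n_samples, size=n_rows, replace=True)
        sub = X[np.ix_(rows, cols)]
        sub = sub - sub.mean(axis=0)

        # PCA via SVD: the rows of vt are the principal axes. All components
        # are kept, so each block is a square orthogonal matrix.
        _, _, vt = np.linalg.svd(sub, full_matrices=True)
        R[np.ix_(cols, cols)] = vt.T

    return R


# Random stand-in data with the shape of the Statlog heart dataset
# (270 instances, 13 attributes). These are NOT the real data.
demo_rng = np.random.default_rng(0)
X_demo = demo_rng.normal(size=(270, 13))

R_demo = build_rotation_matrix(X_demo, n_subsets=3, rng=1)
X_rotated = X_demo @ R_demo  # a base tree would be trained on these features
print(X_rotated.shape)
```

In a full ensemble, each member would draw its own rotation matrix, train a base classifier on the rotated features, and the members' predictions would then be combined. Because every feature subset keeps all of its principal components, the rotation preserves the information in the data while changing the axes along which the trees split.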