Modeling of cardiovascular diseases (CVDs) and development of predictive heart risk score

Bibliographic Details
Main Author: Mirza Rizwan, Sajid
Format: Thesis
Language: English
Published: 2021
Online Access: http://umpir.ump.edu.my/id/eprint/34613/1/Modeling%20of%20cardiovascular%20diseases%20%28CVDs%29%20and%20development%20of%20predictive%20heart%20risk%20score.pdf
http://umpir.ump.edu.my/id/eprint/34613/
Description
Summary: Cardiovascular diseases (CVDs) are the leading cause of death, accounting for 31% of global mortality. The purpose of this study is twofold. First, it develops a statistically valid path model that accounts for possible non-linear paths, mediators, and the binary endogenous feature of CVD status. Second, it develops several forms of local risk prediction models and simple heart risk scores using non-laboratory features and machine learning (ML) algorithms. The prime concern is converting complex ML algorithms into a simple statistical model. A gender-matched case-control study was conducted at the Punjab Institute of Cardiology, Pakistan, in which a sample of 460 individuals was selected through systematic sampling. The warp partial least squares method was used to estimate the multi-layer hypothesized path model. This model estimated warped coefficients using the overall linear trend found in the linear segments of non-linear relationships. It identified novel pathways in which demographic and socioeconomic features are the main drivers of behavioral features, which lead to CVD status both directly and indirectly through metabolic syndrome. In developing the risk prediction models, two ML algorithms, the linear support vector machine and the artificial neural network, outperformed the conventional logistic regression analysis (LRA) model. Model performance was assessed with various established metrics using 10-fold cross-validation. A novel methodology was used to compute simple heart risk scores, called non-laboratory-based heart risk scores (NLHRS). The methodology is a stacking ensemble ML approach in which the best ML algorithms serve as base learners to compute relative feature weights. The index of these weights is referred to as the NLHRS, which was then used as a covariate in a simple LRA model to estimate the likelihood of CVDs.
This conversion from complex, black-box ML algorithms into simple statistical models yielded models that do not require automated systems for their implementation. The ML-based NLHRS and their associated models outperformed the existing semi-quantitative risk score-based model in both discrimination and calibration. Finally, the predictive capability of the valid NLHRS models was tested and adjusted for different strata of the population. The study concludes, first, that adopting a flexible estimation approach makes it possible to model the binary feature of CVDs and non-linear paths in complex path models. The estimated CVDs path model can be implemented as a disease delay strategy in clinical settings. Second, the ML models offer better and more consistent risk prediction than the LRA-based model. The NLHRS and their associated models, which are the outputs of the novel methodology, provide valid and simple forms of risk scores and can be used without automated systems.
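
The summary describes a weighted index of relative feature weights that is then used as the single covariate of a simple logistic model. The sketch below shows only the general shape of such a calculation and is not the thesis's actual procedure. The feature names, raw importances, intercept, and slope are all hypothetical placeholders, the 0-100 scaling is an assumption, and the function names (`normalize_weights`, `risk_score`, `cvd_probability`) are invented for illustration.

```python
import math


def normalize_weights(raw_weights):
    """Scale absolute raw importances so the relative weights sum to 1."""
    total = sum(abs(w) for w in raw_weights.values())
    if total == 0:
        raise ValueError("at least one raw weight must be non-zero")
    return {name: abs(w) / total for name, w in raw_weights.items()}


def risk_score(features, weights):
    """Weighted index on an assumed 0-100 scale.

    Features are coded 0 (absent) or 1 (present); missing ones count as 0.
    """
    return 100.0 * sum(weights[name] * features.get(name, 0) for name in weights)


def cvd_probability(score, intercept, slope):
    """One-covariate logistic model: the score is the only predictor."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))


# Placeholder importances, NOT values reported in the thesis.
raw = {
    "smoker": 0.8,
    "family_history": 0.6,
    "physically_inactive": 0.4,
    "high_bmi": 0.2,
}
weights = normalize_weights(raw)

# Hypothetical individual with two of the four risk features present.
patient = {"smoker": 1, "family_history": 0, "physically_inactive": 1, "high_bmi": 0}
score = risk_score(patient, weights)

# Hypothetical logistic coefficients; in practice they would be estimated from data.
prob = cvd_probability(score, intercept=-3.0, slope=0.05)
print(round(score, 1), round(prob, 3))
```

Once the weights and the two logistic coefficients are fixed, the score and the probability can be worked out by hand or in a spreadsheet, which is the sense in which such a model needs no automated system.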