Optimized support vector regression predicting treatment duration among tuberculosis patients in Malaysia

Bibliographic Details
Main Authors: Balakrishnan, Vimala, Ramanathan, Ghayathri, Zhou, Siyi, Wong, Chee Kuan
Format: Article
Published: Springer 2024
Online Access: http://eprints.um.edu.my/44989/
Description
Summary: Machine learning models have emerged as an advanced tool for predicting diseases and their outcomes. This study developed a machine learning model to predict treatment duration for tuberculosis patients in Malaysia based on a real-life patient dataset. Six regression models, namely Support Vector Regression, Linear Regression, Lasso Regression, Ridge Regression, Random Forest Regression, and Gradient Boosting Regression, were initially developed and then optimized through hyperparameter tuning to determine the best predictive model. Using a dataset of 435 Malaysian tuberculosis patients, we compared our results with data from countries with high tuberculosis prevalence rates, namely Belarus, Nigeria, and Georgia. Experiments showed that Support Vector Regression was the best-performing model, predicting treatment duration with the lowest error rates (Mean Absolute Error = 69.70; Root Mean Squared Error = 109.49). Eight significant risk factors were identified for the Malaysian dataset through Pearson correlation, namely treatment outcome, treatment status, fixed dose combination dosage, maintenance phase regimen, chest X-ray findings, tuberculin skin test, location of treatment initiation, and levofloxacin-based regimen. Comparison with data from the other countries confirmed the consistent performance of the optimized Support Vector Regression model in predicting tuberculosis treatment duration, which supports the model's generalizability. To the best of our knowledge, this is the first study to demonstrate the effectiveness of machine learning in predicting tuberculosis treatment duration based on potential risk factors. These findings will help clinicians make informed decisions about the optimal treatment duration, set patients' expectations, and estimate the cost of tuberculosis treatment. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
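As a reading aid, the quantities named in the summary (Mean Absolute Error, Root Mean Squared Error, and the Pearson correlation coefficient) can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the sample values are made up for demonstration (assumed here to be durations in days) and are not taken from the study's dataset.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Made-up illustrative values (assumed to be days), NOT data from the study.
actual = [180, 270, 365, 180, 540]
predicted = [200, 250, 300, 190, 600]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
r = pearson_r(actual, predicted)
```

Because RMSE squares each error before averaging, it is never smaller than MAE and grows faster when a few predictions are far off, which is why the two figures are often reported together.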