Prediction of breast cancer diagnosis using machine learning in Malaysian women

Breast cancer is the most prevalent cancer in the world and the main cause of cancer mortality in twelve regions of the world. There is therefore a need for efficient screening and diagnosis of the disease. This thesis aims to explore the use of machine learning (ML) for breast cancer risk est...

Bibliographic Details
Main Author: Mokhtar, Tengku Muhammad Hanis Tengku
Format: Thesis
Language: English
Published: 2024
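Subjects: R Medicine
RA440-440.87 Study and teaching. Research
RC254-282 Neoplasms. Tumors. Oncology (including Cancer)
Online Access: http://eprints.usm.my/60999/1/TENGKU%20MUHAMMAD%20HANIS%20BIN%20TENGKU%20MOKHTAR-E.pdf
http://eprints.usm.my/60999/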
id my.usm.eprints.60999
record_format eprints
spelling my.usm.eprints.60999 http://eprints.usm.my/60999/ Prediction of breast cancer diagnosis using machine learning in Malaysian women Mokhtar, Tengku Muhammad Hanis Tengku R Medicine RA440-440.87 Study and teaching. Research RC254-282 Neoplasms. Tumors. Oncology (including Cancer) 2024-03 Thesis NonPeerReviewed application/pdf en http://eprints.usm.my/60999/1/TENGKU%20MUHAMMAD%20HANIS%20BIN%20TENGKU%20MOKHTAR-E.pdf Mokhtar, Tengku Muhammad Hanis Tengku (2024) Prediction of breast cancer diagnosis using machine learning in Malaysian women. PhD thesis, Universiti Sains Malaysia.
institution Universiti Sains Malaysia
building Hamzah Sendut Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Sains Malaysia
content_source USM Institutional Repository
url_provider http://eprints.usm.my/
language English
topic R Medicine
RA440-440.87 Study and teaching. Research
RC254-282 Neoplasms. Tumors. Oncology (including Cancer)
spellingShingle R Medicine
RA440-440.87 Study and teaching. Research
RC254-282 Neoplasms. Tumors. Oncology (including Cancer)
Mokhtar, Tengku Muhammad Hanis Tengku
Prediction of breast cancer diagnosis using machine learning in Malaysian women
description Breast cancer is the most prevalent cancer in the world and the main cause of cancer mortality in twelve regions of the world. There is therefore a need for efficient screening and diagnosis of the disease. This thesis aims to explore the use of machine learning (ML) for breast cancer risk estimation and prediction. The thesis comprises six interrelated projects, presented in Chapters 2 to 7. Chapter 2 presents an overview of breast cancer research in Malaysia. A bibliometric analysis was used to describe breast cancer research activity in Malaysia. This project revealed that there was no dominant research area in breast cancer research in Malaysia. Additionally, the study found that two growing research themes related to breast cancer in Malaysia were precision medicine and deep learning. Chapter 3 explored the most cited global research related to breast cancer and ML. This project also applied bibliometric analysis, this time to the most cited papers related to breast cancer and ML. It found strong interest in the application of ML to breast cancer over the last three decades. Three frequently used ML algorithms were deep learning, support vector machine (SVM), and cluster analysis. In Chapter 4, factors influencing mammographic density among Asian women, including Malaysian women, were investigated. The study used a multiple imputation approach to address missing data and logistic regression to analyse the data. Five factors affecting mammographic density were age, number of children, body mass index, menopause status, and Breast Imaging-Reporting and Data System (BI-RADS) classification. The study in Chapter 5 explored the use of patient registration records and ML for breast cancer risk estimation. The ML model developed in this chapter could be used as an over-the-counter (OTC) screening model for women attending breast clinics. Eight ML algorithms were explored in this project. The k-nearest neighbour (kNN) models performed significantly better than the other seven models. Chapter 6 presents a meta-analysis of ML models for breast cancer classification. This project sought to establish the diagnostic accuracy of ML applied to mammographic data. It found that neural networks, deep learning, tree-based models, and SVM performed well on mammographic data for breast cancer detection. The study established the good diagnostic accuracy of ML in this area of research, further supporting its use, especially for screening and as a supplementary diagnostic tool. Lastly, the study in Chapter 7 explored the use of an ensemble of pre-trained networks for breast abnormality classification using digital mammograms. This project explored thirteen pre-trained networks as candidates for the ensemble model. Each network was further fine-tuned, and the top networks were used to develop the ensemble model. The ensemble of pre-trained networks performed well in classifying normal and suspicious mammograms. In conclusion, this thesis highlights the potential of ML in breast cancer risk estimation and prediction. The findings of this thesis contribute to the growing body of literature on ML in breast cancer research and provide valuable insights for future research in this area.
format Thesis
author Mokhtar, Tengku Muhammad Hanis Tengku
author_facet Mokhtar, Tengku Muhammad Hanis Tengku
author_sort Mokhtar, Tengku Muhammad Hanis Tengku
title Prediction of breast cancer diagnosis using machine learning in Malaysian women
title_short Prediction of breast cancer diagnosis using machine learning in Malaysian women
title_full Prediction of breast cancer diagnosis using machine learning in Malaysian women
title_fullStr Prediction of breast cancer diagnosis using machine learning in Malaysian women
title_full_unstemmed Prediction of breast cancer diagnosis using machine learning in Malaysian women
title_sort prediction of breast cancer diagnosis using machine learning in malaysian women
publishDate 2024
url http://eprints.usm.my/60999/1/TENGKU%20MUHAMMAD%20HANIS%20BIN%20TENGKU%20MOKHTAR-E.pdf
http://eprints.usm.my/60999/