Applying data augmentation technique on blast-induced overbreak prediction : resolving the problem of data shortage and data imbalance

Blast-induced overbreak in tunnels can cause severe damage and has therefore been a main concern in tunnel blasting. Researchers have developed many machine learning-based models to predict overbreak. Collecting overbreak data manually, however, can be challenging and may yield insufficient or...

Bibliographic Details
Main Authors: Biao, He; Danial Jahed, Armaghani; Lai, Sai Hin; Pijush, Samui; Edy Tonnizam, Mohamad
Format: Article
Language: English
Published: Elsevier Ltd. 2023
Subjects: TA Engineering (General). Civil engineering (General)
Online Access: http://ir.unimas.my/id/eprint/44854/2/Applying%20data%20augmentation%20-%20Copy.pdf
http://ir.unimas.my/id/eprint/44854/
https://www.sciencedirect.com/science/article/pii/S0957417423021188
https://doi.org/10.1016/j.eswa.2023.121616
id my.unimas.ir.44854
record_format eprints
spelling my.unimas.ir.44854 2024-05-27T02:02:08Z http://ir.unimas.my/id/eprint/44854/ Article PeerReviewed text en
citation Biao, He and Danial Jahed, Armaghani and Lai, Sai Hin and Pijush, Samui and Edy Tonnizam, Mohamad (2023) Applying data augmentation technique on blast-induced overbreak prediction : resolving the problem of data shortage and data imbalance. Expert Systems With Applications, 237 (Pt.C). pp. 1-14. ISSN 0957-4174
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic TA Engineering (General). Civil engineering (General)
description Blast-induced overbreak in tunnels can cause severe damage and has therefore been a main concern in tunnel blasting. Researchers have developed many machine learning-based models to predict overbreak. Collecting overbreak data manually, however, can be challenging and may yield insufficient or poorly structured data. This study therefore uses a deep generative model, the Conditional Tabular Generative Adversarial Network (CTGAN), to establish an acceptable dataset for overbreak prediction. The CTGAN model was applied to overbreak data collected from paired tunnels: a left-line tunnel and a right-line tunnel. The overbreak dataset collected from the left-line tunnel, designated the true dataset, was used to train the CTGAN model. The trained CTGAN model then generated a synthetic overbreak dataset. Statistical approaches verified the similarity between the true and synthetic datasets; machine learning-based approaches verified the feasibility of using the synthetic dataset to train an overbreak prediction model. Lastly, the study clarified how to resolve the problems of data shortage and data imbalance by leveraging the CTGAN model. The results show that the CTGAN model can effectively generate a high-quality synthetic overbreak dataset. The synthetic dataset not only largely retains the properties of the true dataset but also enhances its diversity. Integrating the true and synthetic overbreak datasets can largely resolve the problems of data shortage and data imbalance in overbreak prediction. The findings therefore highlight this approach as a promising way to address this engineering problem.
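The abstract does not name the statistical checks or the balancing rule the authors used, so the following is only a minimal sketch under stated assumptions. It assumes a two-sample Kolmogorov-Smirnov statistic as one possible similarity measure between a true and a synthetic column, and it assumes the imbalance is expressed as unequal counts across overbreak categories. The function names, the sample values, and the "low"/"high" labels are hypothetical and do not come from the paper.

```python
from bisect import bisect_right
from collections import Counter


def ks_statistic(true_values, synthetic_values):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the two empirical CDFs (0 = identical, 1 = fully separated)."""
    a = sorted(true_values)
    b = sorted(synthetic_values)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The largest gap can only occur at an observed value.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def synthetic_rows_needed(labels):
    """Rows to add per class so that every class matches the largest one."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


# Hypothetical values for illustration only (not data from the study).
true_depths = [0.10, 0.15, 0.20, 0.25, 0.30]
synthetic_depths = [0.12, 0.16, 0.21, 0.24, 0.31]
print(ks_statistic(true_depths, synthetic_depths))
print(synthetic_rows_needed(["low", "low", "low", "high"]))
```

A small statistic suggests the synthetic column follows the true distribution closely; the per-class counts indicate how many synthetic rows a generator would need to supply before the true and synthetic datasets are merged.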
format Article
author Biao, He
Danial Jahed, Armaghani
Lai, Sai Hin
Pijush, Samui
Edy Tonnizam, Mohamad
title Applying data augmentation technique on blast-induced overbreak prediction : resolving the problem of data shortage and data imbalance
publisher Elsevier Ltd.
publishDate 2023