Accurate identification of thirteen fly species from three families using wing venation patterns with machine learning approaches / Ling Min Hao

The ease and the affordability of image data acquisition have made whole-image analysis an attractive analytical approach in biological research. Coupled with machine learning, whole-image analysis has the potential to complement or even supplant traditional morphometric approaches for species ident...

Bibliographic Details
Main Author: Ling Min Hao, Min Hao
Format: Thesis
Published: 2024
Subjects:
Online Access: http://studentsrepo.um.edu.my/15780/1/Ling_Min_Hao.pdf
http://studentsrepo.um.edu.my/15780/
_version_ 1839752406084091904
author Ling Min Hao, Min Hao
author_facet Ling Min Hao, Min Hao
author_sort Ling Min Hao, Min Hao
building UM Library
collection Institutional Repository
content_provider Universiti Malaya
content_source UM Student Repository
continent Asia
country Malaysia
description The ease and the affordability of image data acquisition have made whole-image analysis an attractive analytical approach in biological research. Coupled with machine learning, whole-image analysis has the potential to complement or even supplant traditional morphometric approaches for species identification in medical, veterinary, and forensic entomology. Here, I used a substantially expanded dataset (n = 759; 13 species and a species variant; 3 families) to consolidate findings from a pilot study (n = 74; 15 species; 2 families) on the automated identification of fly species based on their wing venation patterns, using classical Krawtchouk moment invariants coupled with a random forest model. To leverage state-of-the-art methods in image analysis, I also conducted a comparative analysis using ResNet, a deep learning model. Five-fold cross-validation results show impressive mean identification accuracies of 98.56 ± 0.38% and 99.60 ± 0.27% at the family level, and 91.04 ± 1.33% and 97.87 ± 1.01% at the species level, for the classical and deep learning approaches, respectively. Additionally, the mean F1-scores of 0.89 ± 0.02 and 0.97 ± 0.01, respectively, indicate a good balance of precision and recall for both models. Importantly, the regions of the fly wings that ResNet uses for species identification were successfully visualised using Grad-CAM heatmaps, which facilitates interpretation of the putative biological bases of its identifications. In summary, this study demonstrates the extent to which differences among the studied dipteran species can be expressed in wing morphology, both quantitatively and qualitatively, through image data. Specifically, the findings from interpretable deep learning are potentially useful for generating hypotheses about putative wing anatomies that hold taxonomic value.
format Thesis
id my.um.stud-15780
institution Universiti Malaya
publishDate 2024
record_format eprints
spelling my.um.stud-15780 2025-08-04T00:07:41Z Accurate identification of thirteen fly species from three families using wing venation patterns with machine learning approaches / Ling Min Hao Ling Min Hao, Min Hao Q Science (General) QA Mathematics 2024-01 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/15780/1/Ling_Min_Hao.pdf Ling Min Hao, Min Hao (2024) Accurate identification of thirteen fly species from three families using wing venation patterns with machine learning approaches / Ling Min Hao. Masters thesis, Universiti Malaya. http://studentsrepo.um.edu.my/15780/
spellingShingle Q Science (General)
QA Mathematics
Ling Min Hao, Min Hao
Accurate identification of thirteen fly species from three families using wing venation patterns with machine learning approaches / Ling Min Hao
title Accurate identification of thirteen fly species from three families using wing venation patterns with machine learning approaches / Ling Min Hao
topic Q Science (General)
QA Mathematics
url http://studentsrepo.um.edu.my/15780/1/Ling_Min_Hao.pdf
http://studentsrepo.um.edu.my/15780/
url_provider http://studentsrepo.um.edu.my/