Staff View: Phishing image spam classification research trends: Survey and open issues

A phishing email is an attack that focuses entirely on people in order to circumvent existing traditional security algorithms. Email appears to be a dependable, appropriate, and solid communication medium for internet users. At present, email is flooded with spam content, either in text-based form...

Bibliographic Details
Main Authors:	John Abari, Ovye; Mohd Sani, Nor Fazlida; Khalid, Fatimah; Mohd Yunus Bin Sharum, Mohd Yunus; Mohd Ariffin, Noor Afiza
Format:	Article
Published:	The Science and Information Organization 2020
Online Access:	http://psasir.upm.edu.my/id/eprint/87151/ https://thesai.org/Publications/ViewPaper?Volume=11&Issue=11&Code=IJACSA&SerialNo=96

id	my.upm.eprints.87151
record_format	eprints
spelling	my.upm.eprints.87151 2024-05-17T04:14:03Z http://psasir.upm.edu.my/id/eprint/87151/ Phishing image spam classification research trends: Survey and open issues John Abari, Ovye Mohd Sani, Nor Fazlida Khalid, Fatimah Mohd Yunus Bin Sharum, Mohd Yunus Mohd Ariffin, Noor Afiza A phishing email is an attack that focuses entirely on people in order to circumvent existing traditional security algorithms. Email appears to be a dependable, appropriate, and solid communication medium for internet users. At present, email is flooded with spam content, either in text-based form or as undesired text embedded inside images. This study reviews articles on phishing image spam classification published from 2006 to 2020 based on spam classification application domains, datasets, feature sets, spam classification methods, and the measurement metrics adopted in the existing studies. More than 50 articles from the Web of Science and Scopus databases were selected. To achieve the study’s aim, we carried out a broad survey and analysis to identify the domains where spam classification was applied. Furthermore, several public datasets, feature sets, classification methods, and measurement metrics were found, and the most popular ones were pinpointed. The study revealed that the Personal Collection, Dredze, and Spam Archives datasets are the most commonly used datasets in image spam classification research. Low-level features and image metadata are the most widely used feature sets. The methods of image spam classification identified in this study are supervised machine learning, unsupervised machine learning, semi-supervised machine learning, content-based, and statistical learning. Among these methods, the most commonly utilized is the Support Vector Machine (SVM), which falls under supervised machine learning. This is followed by Naïve Bayes and K-Nearest Neighbor. The commonly adopted metrics for the performance evaluation of the existing image spam classifiers are also identified and briefly discussed. We compared the performance of the state-of-the-art image spam models. Lastly, we pointed out promising directions for future research. The Science and Information Organization 2020 Article PeerReviewed John Abari, Ovye and Mohd Sani, Nor Fazlida and Khalid, Fatimah and Mohd Yunus Bin Sharum, Mohd Yunus and Mohd Ariffin, Noor Afiza (2020) Phishing image spam classification research trends: Survey and open issues. International Journal of Advanced Computer Science and Applications, 11 (11). 794 - 805. ISSN 2156-5570; ESSN: 2158-107X https://thesai.org/Publications/ViewPaper?Volume=11&Issue=11&Code=IJACSA&SerialNo=96 10.14569/ijacsa.2020.0111196
institution	Universiti Putra Malaysia
building	UPM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Putra Malaysia
content_source	UPM Institutional Repository
url_provider	http://psasir.upm.edu.my/
description	A phishing email is an attack that focuses entirely on people in order to circumvent existing traditional security algorithms. Email appears to be a dependable, appropriate, and solid communication medium for internet users. At present, email is flooded with spam content, either in text-based form or as undesired text embedded inside images. This study reviews articles on phishing image spam classification published from 2006 to 2020 based on spam classification application domains, datasets, feature sets, spam classification methods, and the measurement metrics adopted in the existing studies. More than 50 articles from the Web of Science and Scopus databases were selected. To achieve the study’s aim, we carried out a broad survey and analysis to identify the domains where spam classification was applied. Furthermore, several public datasets, feature sets, classification methods, and measurement metrics were found, and the most popular ones were pinpointed. The study revealed that the Personal Collection, Dredze, and Spam Archives datasets are the most commonly used datasets in image spam classification research. Low-level features and image metadata are the most widely used feature sets. The methods of image spam classification identified in this study are supervised machine learning, unsupervised machine learning, semi-supervised machine learning, content-based, and statistical learning. Among these methods, the most commonly utilized is the Support Vector Machine (SVM), which falls under supervised machine learning. This is followed by Naïve Bayes and K-Nearest Neighbor. The commonly adopted metrics for the performance evaluation of the existing image spam classifiers are also identified and briefly discussed. We compared the performance of the state-of-the-art image spam models. Lastly, we pointed out promising directions for future research.
format	Article
author	John Abari, Ovye; Mohd Sani, Nor Fazlida; Khalid, Fatimah; Mohd Yunus Bin Sharum, Mohd Yunus; Mohd Ariffin, Noor Afiza
spellingShingle	John Abari, Ovye Mohd Sani, Nor Fazlida Khalid, Fatimah Mohd Yunus Bin Sharum, Mohd Yunus Mohd Ariffin, Noor Afiza Phishing image spam classification research trends: Survey and open issues
author_facet	John Abari, Ovye; Mohd Sani, Nor Fazlida; Khalid, Fatimah; Mohd Yunus Bin Sharum, Mohd Yunus; Mohd Ariffin, Noor Afiza
author_sort	John Abari, Ovye
title	Phishing image spam classification research trends: Survey and open issues
title_short	Phishing image spam classification research trends: Survey and open issues
title_full	Phishing image spam classification research trends: Survey and open issues
title_fullStr	Phishing image spam classification research trends: Survey and open issues
title_full_unstemmed	Phishing image spam classification research trends: Survey and open issues
title_sort	phishing image spam classification research trends: survey and open issues
publisher	The Science and Information Organization
publishDate	2020
url	http://psasir.upm.edu.my/id/eprint/87151/ https://thesai.org/Publications/ViewPaper?Volume=11&Issue=11&Code=IJACSA&SerialNo=96
_version_	1800093841784569856
score	13.159267

Similar Items