Detection of phishing websites using machine learning approaches

As the world responded to the Coronavirus Disease 2019 (COVID-19) pandemic in 2020, digital operations became more important, and people started to depend on new initiatives such as the cloud and mobile infrastructure. Consequently, the number of cyberattacks such as phishing has increased. Phishing...

Bibliographic Details
Main Authors: Farashazillah Yahya, Magnus Anai, Ryan Isaac W Mahibol, Sidney Allister Frankie, Rio Guntur Utomo, Chong Kim Ying, Eric Ling Nin Wei
Format: Proceedings
Language: English
Published: Institute of Electrical and Electronics Engineers 2021
Subjects:
Online Access: https://eprints.ums.edu.my/id/eprint/32525/1/Detection%20of%20phishing%20websites%20using%20machine%20learning%20approaches.ABSTRACT.pdf
https://eprints.ums.edu.my/id/eprint/32525/2/Detection%20of%20phishing%20websites%20using%20machine%20learning%20approaches.pdf
https://eprints.ums.edu.my/id/eprint/32525/
https://ieeexplore.ieee.org/document/9617482
id my.ums.eprints.32525
record_format eprints
spelling my.ums.eprints.32525 2022-05-03T13:27:09Z https://eprints.ums.edu.my/id/eprint/32525/ Farashazillah Yahya and Magnus Anai and Ryan Isaac W Mahibol and Sidney Allister Frankie and Rio Guntur Utomo and Chong Kim Ying and Eric Ling Nin Wei (2021) Detection of phishing websites using machine learning approaches. Institute of Electrical and Electronics Engineers. Proceedings. PeerReviewed. https://ieeexplore.ieee.org/document/9617482
institution Universiti Malaysia Sabah
building UMS Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sabah
content_source UMS Institutional Repository
url_provider http://eprints.ums.edu.my/
language English
topic Q1-295 General
TK5101-6720 Telecommunication Including telegraphy, telephone, radio, radar, television
description As the world responded to the Coronavirus Disease 2019 (COVID-19) pandemic in 2020, digital operations became more important, and people started to depend on new initiatives such as cloud and mobile infrastructure. Consequently, the number of cyberattacks such as phishing has increased. Phishing websites can be detected with machine learning by classifying websites as legitimate or illegitimate. The purpose of the study is to conduct a mini-review of existing techniques and to run experiments that detect whether a website is malicious. The dataset consists of 11,055 observations and 32 variables. Three supervised learning models are implemented in this study: Decision Tree, K-Nearest Neighbour (KNN), and Random Forest. The three algorithms were chosen because they provide a better understanding and are more suitable for the dataset. In the experiments, KNN achieved the highest accuracy at 97.6%, followed by Random Forest at 94.44%, while Decision Tree had the lowest at 91.16%. On accuracy alone, KNN was the prime candidate for the best model. However, Random Forest is deemed more suitable for the dataset despite its lower accuracy, because it has the lowest false-negative value of the three models. The experiments can be extended with different datasets and models for comparative analysis.
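The model-selection reasoning in the description (preferring a model with fewer false negatives over one with higher accuracy) can be illustrated with a minimal Python sketch. The counts below are hypothetical and are not taken from the study; the names `Confusion`, `model_a`, and `model_b` are placeholders, and the sketch assumes phishing is the positive class and compares false-negative rates rather than raw counts.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts, treating 'phishing' as the positive class."""
    tp: int  # phishing sites correctly flagged
    fn: int  # phishing sites missed (classified as legitimate)
    fp: int  # legitimate sites wrongly flagged
    tn: int  # legitimate sites correctly passed

    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total

    def false_negative_rate(self) -> float:
        return self.fn / (self.tp + self.fn)


# Hypothetical counts for two placeholder models (NOT results from the study).
candidates = {
    "model_a": Confusion(tp=470, fn=30, fp=5, tn=495),
    "model_b": Confusion(tp=490, fn=10, fp=45, tn=455),
}

# Ranking by accuracy and ranking by missed phishing sites can disagree.
best_by_accuracy = max(candidates, key=lambda n: candidates[n].accuracy())
best_by_fnr = min(candidates, key=lambda n: candidates[n].false_negative_rate())

for name, cm in candidates.items():
    print(f"{name}: accuracy={cm.accuracy():.3f}, "
          f"false-negative rate={cm.false_negative_rate():.3f}")
print("highest accuracy:", best_by_accuracy)
print("fewest missed phishing sites:", best_by_fnr)
```

In this made-up case the two criteria pick different models, which mirrors the kind of trade-off the description reports between KNN and Random Forest: a missed phishing site is usually costlier than a false alarm, so the lower false-negative figure can outweigh a few points of accuracy.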
format Proceedings
author Farashazillah Yahya
Magnus Anai
Ryan Isaac W Mahibol
Sidney Allister Frankie
Rio Guntur Utomo
Chong Kim Ying
Eric Ling Nin Wei
title Detection of phishing websites using machine learning approaches
publisher Institute of Electrical and Electronics Engineers
publishDate 2021
url https://eprints.ums.edu.my/id/eprint/32525/1/Detection%20of%20phishing%20websites%20using%20machine%20learning%20approaches.ABSTRACT.pdf
https://eprints.ums.edu.my/id/eprint/32525/2/Detection%20of%20phishing%20websites%20using%20machine%20learning%20approaches.pdf
https://eprints.ums.edu.my/id/eprint/32525/
https://ieeexplore.ieee.org/document/9617482