Comparative study of machine learning algorithms in website phishing detection

Malicious programs designed to steal user credentials have proliferated in recent years, potentially leading to financial loss. Attackers use a variety of methods to collect confidential information, and online banking systems remain the main target of these attacks...

Bibliographic Details
Main Author: Kalybayev, Almukhammed
Format: Thesis
Language: English
Published: 2013
Subjects: TK Electrical engineering. Electronics. Nuclear engineering
Online Access: http://eprints.utm.my/id/eprint/35828/5/AlmukhammedKalbayevMFSKSM2013.pdf
http://eprints.utm.my/id/eprint/35828/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:70363?site_name=Restricted Repository
id my.utm.35828
record_format eprints
spelling my.utm.35828 2017-06-29T06:40:27Z 2013-08 Thesis NonPeerReviewed application/pdf en Kalybayev, Almukhammed (2013) Comparative study of machine learning algorithms in website phishing detection. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computing.
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic TK Electrical engineering. Electronics. Nuclear engineering
description Malicious programs designed to steal user credentials have proliferated in recent years, potentially leading to financial loss. Attackers use a variety of methods to collect confidential information, and online banking systems remain the main target of these attacks. Today the most widespread protection against phishing attacks is the use of blacklists in antivirus software and browser toolbars. Unfortunately, the blacklist method fails to respond to newly emerging phishing attacks: because registering new domain names has become easier, no blacklist can guarantee a fully up-to-date database. An approach to countering phishing that is more accurate and efficient than the blacklist method is therefore required. The purpose of this work is to evaluate and analyze the effectiveness of applying machine learning algorithms such as an Artificial Neural Network, Support Vector Machines, and K-Nearest Neighbor to website phishing detection. Datasets of phishing and non-phishing websites were gathered to train and test the machine learning models and to compare the evaluation metrics of the algorithms with one another. In addition, the final dataset was divided into three datasets with different ratios to see whether the trained models show consistent performance in testing and whether these proportions influence, for better or worse, the ability of the trained models to classify websites. Finally, the performance of each machine learning algorithm was analyzed. This project suggests the Support Vector Machines algorithm as the best one to use in phishing detection regardless of dataset proportion, because it showed almost the same performance throughout all test phases, 98.5% on average.
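The evaluation procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the record does not state the features, tools, or ratios used, so the two synthetic features, the function names (`make_toy_dataset`, `knn_predict`, `accuracy_at_ratio`, `compare_ratios`), and the train fractions 0.5, 0.7, and 0.9 are all assumptions. "Ratio" is read here as the train/test split; the abstract could equally mean the proportion of phishing to non-phishing samples. Only K-Nearest Neighbor is shown, written against the Python standard library.

```python
import random
from collections import Counter
from math import dist


def make_toy_dataset(n_per_class, seed=0):
    """Build hypothetical 2-feature samples: label 1 = phishing, 0 = legitimate.

    The features are synthetic Gaussian clusters, not real website features.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append(((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)), 0))
        data.append(((rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)), 1))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k=3):
    """Return the majority label among the k training samples nearest to point."""
    nearest = sorted(train, key=lambda sample: dist(sample[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy_at_ratio(data, train_fraction, k=3):
    """Train on the first train_fraction of data, test on the remainder."""
    cut = int(len(data) * train_fraction)
    train, test = data[:cut], data[cut:]
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


def compare_ratios(data, fractions=(0.5, 0.7, 0.9), k=3):
    """Map each assumed train fraction to the test accuracy obtained with it."""
    return {f: accuracy_at_ratio(data, f, k) for f in fractions}


if __name__ == "__main__":
    for fraction, acc in compare_ratios(make_toy_dataset(100)).items():
        print(f"train fraction {fraction:.1f}: accuracy {acc:.3f}")
```

A model whose accuracy stays roughly constant across the fractions would match the behavior the abstract attributes to Support Vector Machines; comparing the other two algorithms would require substituting them for `knn_predict`.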
format Thesis
author Kalybayev, Almukhammed
title Comparative study of machine learning algorithms in website phishing detection
publishDate 2013