Cascade Generalization Based Functional Tree for Website Phishing Detection

Bibliographic Details
Main Authors: Balogun, A.O., Adewole, K.S., Bajeh, A.O., Jimoh, R.G.
Format: Article
Published: Springer Science and Business Media Deutschland GmbH 2021
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85121900102&doi=10.1007%2f978-981-16-8059-5_17&partnerID=40&md5=1d34d6463d72d99a490d7893201078bd
http://eprints.utp.edu.my/29328/
Description
Summary: The web and internet are now used to deliver a wide range of services, from financial to medical. This has brought an increase in cybersecurity issues over the years, a prominent one being the phishing attack, in which a malicious website mimics the appearance and functionality of a legitimate website to collect the user credentials required for access to services. Several measures have been proposed to mitigate this attack; blacklisting and variants of machine learning approaches have been employed, yielding good performance results. However, there is a need to increase the rate of identification of phishing attacks and reduce the rate of false positives. This study proposes the use of a functional tree (FT) machine learning approach to mitigate phishing attacks. FT, a hybridization of multivariate decision trees and discriminant functions using constructive induction, uses logistic regression for splitting tree nodes and for leaf prediction, unlike the conventional decision tree, which simply splits nodes based on the data. Furthermore, a variant of FT based on cascade generalization (CG-FT) is proposed. Three datasets with varied instance distributions, both balanced and imbalanced, are used in the empirical investigation of the performance of the proposed CG-FT. The results showed that FT outperformed some selected baseline classifiers. Relative to FT, CG-FT improved the detection of phishing attacks, with Area Under the Curve (AUC) and True Positive rate (TP-rate) ranging from 98 to 99.6 and from 92 to 97, respectively, across the datasets. The false-positive rate was also reduced, with values ranging from 1.7 to 6.1. The proposed CG-FT showed improvement over all the other reviewed approaches on the studied performance metrics. Overall, the use of FT and its hybridization with cascade generalization (CG-FT) improved performance in the mitigation of phishing attacks.
© 2021, Springer Nature Singapore Pte Ltd.
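
The cascade generalization step named in the summary can be illustrated with a minimal sketch. It is not the authors' CG-FT implementation: the level-0 learner here is a small hand-written logistic regression and the level-1 learner is a decision stump, both standing in for the functional tree, and the data are synthetic. All function and variable names are hypothetical. The only part taken from the technique itself is the cascade step, in which the class probability produced by the level-0 learner is appended to the original attributes before the next learner is trained. TP-rate and FP-rate are computed as in the summary.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, epochs=300, lr=0.5):
    """Level-0 stand-in: logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def extend(X, probs):
    """Cascade step: append the level-0 probability as an extra attribute."""
    return [xi + [p] for xi, p in zip(X, probs)]

def train_stump(X, y):
    """Level-1 stand-in: best single-attribute threshold by training accuracy."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(xi[j] for xi in X)):
            for sign in (1, -1):
                preds = [1 if sign * (xi[j] - t) >= 0 else 0 for xi in X]
                acc = sum(p == yi for p, yi in zip(preds, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    return best[1:]

def stump_predict(stump, X):
    j, t, sign = stump
    return [1 if sign * (xi[j] - t) >= 0 else 0 for xi in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def rates(y_true, y_pred):
    """Return (TP-rate, FP-rate), treating label 1 as the phishing class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return (tp / pos if pos else 0.0, fp / neg if neg else 0.0)

# Synthetic data (an assumption for illustration): label 1 when x0 + x1 > 1,
# a boundary that no single-attribute threshold on x0 or x1 can capture.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if a + b > 1.0 else 0 for a, b in X]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

# Baseline: level-1 learner on the original attributes only.
stump_raw = train_stump(X_train, y_train)
acc_raw_train = accuracy(y_train, stump_predict(stump_raw, X_train))

# Cascade: level-0 probabilities extend the attributes seen by level 1.
level0 = train_logistic(X_train, y_train)
X_train_ext = extend(X_train, predict_proba(level0, X_train))
X_test_ext = extend(X_test, predict_proba(level0, X_test))
stump_cascade = train_stump(X_train_ext, y_train)
acc_cascade_train = accuracy(y_train, stump_predict(stump_cascade, X_train_ext))

tp_rate, fp_rate = rates(y_test, stump_predict(stump_cascade, X_test_ext))
print("train accuracy (raw, cascade):", acc_raw_train, acc_cascade_train)
print("held-out TP-rate, FP-rate:", tp_rate, fp_rate)
```

Because the extended attribute set contains every original attribute, the level-1 learner in this sketch can never fit the training data worse than it does without the cascade. Whether that carries over to held-out data depends on the learners and the dataset, which is what the study evaluates empirically for FT.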