Phishing Detection With Identity Keywords and Target Domain Name

This thesis describes the research work carried out to address the problem of phishing detection and the weaknesses in existing anti-phishing methods. Phishing works by luring users to counterfeit websites, where highly confidential credentials are requested. To safeguard Internet users against phis...

Bibliographic Details
Main Author: Tan, Colin Choon Lin
Format: Thesis
Language: English
Published: unimas 2015
Subjects: Q Science (General); QA75 Electronic computers. Computer science
Online Access: http://ir.unimas.my/id/eprint/21452/1/Phishing%20Detection%20With%20Identity%20Keywords%2024pgs.pdf
http://ir.unimas.my/id/eprint/21452/8/Colin.pdf
http://ir.unimas.my/id/eprint/21452/
id my.unimas.ir.21452
record_format eprints
spelling my.unimas.ir.21452 2023-08-24T04:40:31Z http://ir.unimas.my/id/eprint/21452/ Phishing Detection With Identity Keywords and Target Domain Name Tan, Colin Choon Lin Q Science (General) QA75 Electronic computers. Computer science unimas 2015 Thesis NonPeerReviewed
citation Tan, Colin Choon Lin (2015) Phishing Detection With Identity Keywords and Target Domain Name. Masters thesis, UNIMAS.
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic Q Science (General)
QA75 Electronic computers. Computer science
description This thesis describes research carried out to address the problem of phishing detection and the weaknesses of existing anti-phishing methods. Phishing works by luring users to counterfeit websites, where highly confidential credentials are requested. To safeguard Internet users against phishing attacks, a hybrid anti-phishing method that combines text-based, search engine-based and identity-based techniques is proposed; it classifies a webpage by exploiting the differences between its target identity and its actual identity. The proposed method has three phases. The first phase extracts identity keywords from the textual contents of the website, using a novel weighted URL tokens system based on the N-gram model. The second phase uses a search engine to find the target domain name, which is selected based on identity-relevant features. In the final phase, a 3-tier identity matching system exploits indirect identity relationships to determine the legitimacy of the query webpage. Experiments were conducted over 10,000 datasets, where a true positive rate of 99.68% and a true negative rate of 92.52% were achieved. Benchmarking results also suggest that the overall accuracy of the proposed method is comparable to that of three selected conventional methods. In summary, the key advantage of the proposed method is that it identifies phishing webpages accurately, which is highly desirable in anti-phishing applications.
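The first and final phases described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record does not give the weighting formula or the three matching tiers, so the scoring rule (term frequency boosted by character trigram overlap with URL tokens), the single-tier domain check, and the names `char_ngrams`, `url_tokens`, `identity_keywords` and `is_phishing` are all assumptions made for illustration. The search engine phase is left out; the target domain is supplied by the caller.

```python
import re
from collections import Counter
from urllib.parse import urlparse


def char_ngrams(token, n=3):
    """Return the set of character n-grams of a token (the token itself if it is short)."""
    if len(token) <= n:
        return {token}
    return {token[i:i + n] for i in range(len(token) - n + 1)}


def url_tokens(url):
    """Split the host and path of a URL into lowercase alphanumeric tokens."""
    parts = urlparse(url)
    raw = (parts.netloc + parts.path).lower()
    return [t for t in re.split(r"[^a-z0-9]+", raw) if t]


def identity_keywords(text, url, top_k=3, n=3):
    """Rank page terms by frequency, boosted by n-gram overlap with URL tokens.

    Assumed scoring rule (hypothetical, not from the thesis):
    score = frequency * (1 + share of the term's n-grams found in the URL).
    """
    terms = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    url_grams = set()
    for tok in url_tokens(url):
        url_grams |= char_ngrams(tok, n)
    scores = {}
    for term, freq in terms.items():
        grams = char_ngrams(term, n)
        overlap = len(grams & url_grams) / len(grams)
        scores[term] = freq * (1.0 + overlap)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def is_phishing(query_url, target_domain):
    """Single-tier identity check (a stand-in for the 3-tier system).

    Flags the page when its host is neither the target domain
    nor a subdomain of it.
    """
    host = urlparse(query_url).netloc.lower()
    target = target_domain.lower()
    return not (host == target or host.endswith("." + target))
```

For a page whose text repeatedly mentions a brand and whose URL embeds that brand in an unrelated host, `identity_keywords` ranks the brand first, and `is_phishing` then compares the page's host with whatever target domain the (omitted) search step would have returned.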
format Thesis
author Tan, Colin Choon Lin
title Phishing Detection With Identity Keywords and Target Domain Name
publisher unimas
publishDate 2015