Validation of individual identification through decision tree packet header profiling

Bibliographic Details
Main Authors: Khairul Osman; T'ng, Qi Feng; Hairee Izzam Mohd Noor; Noor Hazfalinda Hamzah; Gina Francesca Gabriel
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2022
Online Access: http://journalarticle.ukm.my/20851/1/8.pdf
http://journalarticle.ukm.my/20851/
https://www.ukm.my/apjitm/articles-issues
Description
Summary: The sharp rise in cybercrime that has accompanied users' growing dependence on the Internet has heightened digital forensic examiners' interest in the footprints that perpetrators leave in virtual environments. However, suspect identification remains a major challenge in network forensics because data transmission across a network is anonymous. This study uses the decision tree classification approach to characterise users by their behavioural web navigation patterns, based on the metadata of captured network packets (Destination IP, Protocol, Port Source, and Port Destination). A total of 95,795,379 network packet headers were collected from 96 subjects. The packet header metadata were statistically profiled to generate digital fingerprints intended to link each subject's actions on the network to their identity. CHAID decision tree modelling using Destination IP, Unique protocols, and a combination of the two together with Port source and Port destination yielded accuracies of 4.07%, 6.34%, and 6.36%, respectively. However, the modelling could not create a reliable decision tree for Port source and Port destination. A validation study on all the combined variables gave a similar accuracy of 6.36%, indicating that the model is reproducible. Despite this reproducibility, the proposed method is not yet sufficiently accurate for suspect identification, and further enhancement to improve its accuracy is required.
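
As a rough illustration of the idea behind CHAID modelling on packet header fields, the sketch below computes a Pearson chi-square statistic between each categorical header field and the user label, then picks the field with the strongest association as the first split. It is a minimal sketch under stated assumptions, not the authors' procedure: the field names (`dst_ip`, `protocol`, `user`), the toy packets, and the helper functions are all hypothetical, and a full CHAID implementation would also merge similar categories and compare adjusted p-values rather than raw statistics. Comparing raw statistics is only reasonable here because both toy fields have the same number of categories.

```python
from collections import Counter


def chi_square(rows, feature, label="user"):
    """Pearson chi-square statistic between one categorical field and the label."""
    n = len(rows)
    joint = Counter((r[feature], r[label]) for r in rows)
    feature_totals = Counter(r[feature] for r in rows)
    label_totals = Counter(r[label] for r in rows)
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Expected count if the field and the label were independent.
            expected = f_count * l_count / n
            observed = joint.get((f_value, l_value), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def best_split(rows, features, label="user"):
    """Return the field most strongly associated with the label (simplified)."""
    return max(features, key=lambda f: chi_square(rows, f, label))


# Hypothetical toy packet headers; not data from the study.
packets = [
    {"dst_ip": "10.0.0.1", "protocol": "TCP", "user": "A"},
    {"dst_ip": "10.0.0.1", "protocol": "TCP", "user": "A"},
    {"dst_ip": "10.0.0.2", "protocol": "TCP", "user": "B"},
    {"dst_ip": "10.0.0.2", "protocol": "UDP", "user": "B"},
]

if __name__ == "__main__":
    for field in ("dst_ip", "protocol"):
        print(field, round(chi_square(packets, field), 4))
    print("first split on:", best_split(packets, ["dst_ip", "protocol"]))
```

In this toy data the destination IP separates the two users perfectly, so it is chosen ahead of the protocol field. With 96 subjects and heavily overlapping destinations, as in the study, such clean separation would not be expected.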