Robust adaptive multivariate Hotelling's T2 control chart based on kernel density estimation for intrusion detection system

Bibliographic Details
Main Authors: Ahsan, Muhammad; Mashuri, Muhammad; Lee, Muhammad Hisyam; Kuswanto, Heri; Prastyo, Dedy Dwi
Format: Article
Published: Elsevier Ltd 2020
Online Access: http://eprints.utm.my/id/eprint/87242/
http://dx.doi.org/10.1016/j.eswa.2019.113105
Description
Summary: Using a conventional multivariate control chart for network intrusion detection raises two main problems. First, false alarms are frequent because network traffic data do not follow the theoretical distribution. Second, the control chart fails to detect outliers because of the masking effect. To overcome these problems, this paper proposes a multivariate control chart based on the fast minimum covariance determinant (Fast-MCD) algorithm and kernel density estimation (KDE). KDE is expected to follow the network traffic data pattern adaptively, thereby reducing false alarms, while Fast-MCD improves the chart's ability to detect outliers quickly and accurately. On simulated data, the proposed chart is more accurate than the conventional T2 chart and other robust T2 charts based on the successive difference covariance matrix (SDSM). On data generated from several distributions, the proposed chart shows its adaptability by producing a low false alarm rate with a high detection rate. The proposed chart performs well when monitoring the KDD99 dataset (98.61% accuracy), the NSL-KDD dataset (91.71% accuracy), and the UNSW-NB 15 dataset (91.02% accuracy). Its performance remains consistent when monitoring a small subset of each dataset, which reduces computational time by more than 90% without lowering accuracy or precision. The proposed chart also outperforms the other benchmarks.
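
Illustration: the general idea in the summary (a T2 statistic computed from robust Fast-MCD estimates, with a control limit taken from a kernel density estimate instead of a theoretical distribution) can be sketched roughly as below. This is a minimal sketch and not the authors' implementation. It assumes scikit-learn's MinCovDet as the Fast-MCD estimator and SciPy's gaussian_kde as the KDE. The synthetic data, the false alarm level alpha = 0.01, and the helper names t2_statistics and kde_control_limit are hypothetical and chosen only for the example.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import gaussian_kde
from sklearn.covariance import MinCovDet


def t2_statistics(X, location, covariance):
    """Hotelling-type T2 (squared Mahalanobis distance) for each row of X."""
    diff = X - location
    inv_cov = np.linalg.inv(covariance)
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)


def kde_control_limit(t2_values, alpha=0.01):
    """Upper control limit: the (1 - alpha) quantile of a Gaussian KDE fitted to T2."""
    kde = gaussian_kde(t2_values)
    lower = t2_values.min()
    upper = t2_values.max() + 10 * t2_values.std()  # far enough that the KDE CDF is ~1

    def cdf_minus_target(x):
        return kde.integrate_box_1d(-np.inf, x) - (1 - alpha)

    # Root of CDF(x) - (1 - alpha) on [lower, upper]
    return brentq(cdf_minus_target, lower, upper)


# Synthetic stand-in for in-control traffic features (assumption: 4 features)
rng = np.random.default_rng(0)
normal_train = rng.normal(size=(500, 4))

# Robust location and scatter from Fast-MCD (scikit-learn's MinCovDet)
mcd = MinCovDet(random_state=0).fit(normal_train)
t2_train = t2_statistics(normal_train, mcd.location_, mcd.covariance_)

# Control limit estimated from the empirical T2 distribution via KDE
ucl = kde_control_limit(t2_train, alpha=0.01)

# Monitor new observations: in-control data and a mean-shifted "attack" sample
new_normal = rng.normal(size=(200, 4))
new_attack = rng.normal(loc=4.0, size=(200, 4))
false_alarm_rate = np.mean(
    t2_statistics(new_normal, mcd.location_, mcd.covariance_) > ucl
)
detection_rate = np.mean(
    t2_statistics(new_attack, mcd.location_, mcd.covariance_) > ucl
)
print(f"UCL={ucl:.2f}  false alarms={false_alarm_rate:.3f}  detections={detection_rate:.3f}")
```

In this sketch an observation signals when its T2 value exceeds the KDE-based limit. Because the limit is read off the estimated density of the training T2 values, it does not rely on the data following a particular theoretical distribution.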