An ensemble-based anomaly-behavioural crypto-ransomware pre-encryption detection model
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | http://eprints.utm.my/id/eprint/98097/1/BanderAliSalehPSC2019.pdf http://eprints.utm.my/id/eprint/98097/ http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:143725 |
Summary: | Crypto-ransomware is a type of malware that uses cryptography to encrypt files for extortion. Even after such an attack is neutralized, the targeted files remain encrypted. This irreversible effect on the target is what distinguishes crypto-ransomware attacks from traditional malware. Thus, it is imperative to detect such attacks during the pre-encryption phase. However, existing crypto-ransomware early detection solutions are not effective because of the inaccurate definition of the pre-encryption phase boundaries, insufficient data at that phase, and the misuse-based approach they employ, which is not suitable for detecting new (zero-day) attacks. Consequently, those solutions suffer from low detection accuracy and a high rate of false alarms. Therefore, this research addressed these issues and developed an Ensemble-Based Anomaly-Behavioural Pre-encryption Detection Model (EABDM) to overcome data insufficiency and improve the detection accuracy for known and novel crypto-ransomware attacks. EABDM was developed in three phases. In the first phase, a Dynamic Pre-encryption Boundary Definition and Features Extraction (DPBD-FE) scheme was developed by incorporating Rocchio feedback and the vector space model to build a pre-encryption boundary vector. Then, an improved term frequency-inverse document frequency (TF-IDF) technique was used to extract features from the runtime data generated during the pre-encryption phase of the crypto-ransomware attack lifecycle. In the second phase, a Maximum of Minimum-Based Enhanced Mutual Information Feature Selection (MM-EMIFS) technique was used to select the informative feature set and prevent the overfitting caused by high-dimensional data. MM-EMIFS used the developed Redundancy Coefficient Gradual Upweighting (RCGU) technique to overcome data insufficiency during the pre-encryption phase and improve the estimation of feature significance.
In the final phase, an improved technique called incremental bagging (iBagging) built incremental data subsets for the anomaly-based and behavioural-based detection ensembles. The enhanced semi-random subspace selection (ESRS) technique was then used to build noise-free and diverse subspaces for each of these incremental data subsets. The base classifiers of each ensemble were trained on these subspaces. Both ensembles used majority voting to combine the decisions of their base classifiers. The decision of the anomaly ensemble was then combined into the behavioural ensemble, which gave the final decision. The experimental evaluation showed that the DPBD-FE scheme reduced the ratio of crypto-ransomware samples whose pre-encryption boundaries were missed from 18% to 8% compared with existing works. Additionally, the features selected by the MM-EMIFS technique improved detection accuracy from 89% to 96% compared with existing techniques. Likewise, on average, the EABDM model increased detection accuracy from 85% to 97.88% and reduced false positive alarms from 12% to 1% compared with existing early detection models. These results demonstrate the ability of EABDM to detect crypto-ransomware attacks accurately and early, before encryption takes place, protecting files from being held to ransom. |
---|
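The two-ensemble decision step described in the summary can be illustrated with a minimal sketch. The summary states only that each ensemble uses majority voting and that the anomaly ensemble's decision is combined into the behavioural ensemble. How that combination is done is not specified, so the sketch assumes the anomaly decision is added as one extra vote. The function names, the label encoding (0 = benign, 1 = ransomware), and the tie-breaking rule are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from typing import Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Combine base-classifier labels (0 = benign, 1 = ransomware).

    Assumption: a tie is resolved as 1, the cautious choice for a detector.
    """
    counts = Counter(votes)
    return 1 if counts.get(1, 0) >= counts.get(0, 0) else 0


def combined_decision(anomaly_votes: Sequence[int],
                      behavioural_votes: Sequence[int]) -> int:
    """Return the final label produced by the behavioural ensemble."""
    anomaly_decision = majority_vote(anomaly_votes)
    # Assumption: the anomaly ensemble's decision enters the behavioural
    # ensemble as one additional vote; the thesis may combine them differently.
    return majority_vote(list(behavioural_votes) + [anomaly_decision])


if __name__ == "__main__":
    # Three anomaly classifiers and two behavioural classifiers (made-up votes).
    print(combined_decision([1, 1, 0], [0, 1]))
```

In this sketch the behavioural ensemble always has the last word, which mirrors the summary's statement that it gives the final decision.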