Impact of dataset balancing on machine learning-based intrusion detection systems
Main Authors: | , , , , , |
---|---|
Format: | Proceeding Paper |
Language: | English |
Published: | IEEE, 2024 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/114534/7/114534_Impact%20of%20dataset%20balancing.pdf http://irep.iium.edu.my/114534/ https://ieeexplore.ieee.org/document/10675568 |
Summary: | Intrusion Detection Systems (IDS) are indispensable for cybersecurity, as they safeguard networks from increasingly sophisticated cyberattacks. This paper assesses the influence of dataset balancing on the performance of machine learning-based IDS, addressing the challenge of imbalanced data in detecting network intrusions. We concentrate on three IDS implementations: Tree-based Intelligent IDS, Multi-Tiered Hybrid IDS (MTH-IDS), and Leader Class and Confidence Decision Ensemble (LCCDE). Using the CICIDS 2017 dataset, we applied the Synthetic Minority Over-Sampling Technique (SMOTE) to balance the data, together with feature selection and hyperparameter optimization to improve model performance. Our comparative analysis shows that combining SMOTE with feature selection improves F1 scores, and that the LCCDE model achieves the highest performance. The results underscore the significance of advanced ensemble techniques and data preprocessing in developing resilient IDS. This research emphasizes the necessity of ongoing optimization and evaluation of IDS models to ensure effective protection against evolving cyber threats. |
---|
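
The summary names SMOTE as the balancing step but this record does not show how the authors applied it. The following is only a minimal sketch of the core SMOTE idea (synthesizing minority-class samples by interpolating between a sample and one of its nearest minority neighbours), written in plain Python. The function name `smote_oversample`, the parameter values, and the toy data are all assumptions made for illustration; they are not taken from the paper or from CICIDS 2017.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built from a list of minority-class tuples.

    Hypothetical helper: each synthetic point lies on the line segment between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours. Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class only
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature data, invented for this example (not real traffic records)
benign = [(0.1 * i, 0.2 * i) for i in range(20)]
attack = [(5.0, 5.0), (5.5, 4.8), (6.0, 5.2), (5.2, 5.9)]

# Oversample the minority class until both classes have the same size
new_attack = smote_oversample(attack, n_new=len(benign) - len(attack))
balanced_attack = attack + new_attack
print(len(benign), len(balanced_attack))  # prints "20 20"
```

In practice a maintained implementation such as `imblearn.over_sampling.SMOTE` from the imbalanced-learn package would normally be used instead of a hand-written loop, and oversampling is usually applied to the training split only so that synthetic samples do not leak into evaluation.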