MODEL FOR IMPROVING PERFORMANCE OF NETWORK INTRUSION DETECTION BASED ON MACHINE LEARNING TECHNIQUES
Abstract
Digital crimes have increased in number and sophistication, affecting network quality-of-service parameters such as confidentiality, integrity, and availability of resources. Network Intrusion Detection Systems (NIDSs) are deployed to optimize detection and provide a comprehensive view of intrusion activities. However, NIDSs generate large volumes of alerts that mix false positives, repeated warnings for the same attack, and notifications triggered by erroneous activity. This prevents security analysts from evaluating the severity of each attack and selecting a suitable response plan in time to prevent the loss of information and resources in the network. Achieving high accuracy while lowering false alarm rates remains a major challenge in designing an intrusion detection system. To address this issue, this work proposes a three-level model for network intrusion detection that offers multiple types of correlation. In the first level, several feature selection techniques are integrated to find the best set of features used in this work. The existing feature selection techniques include a Correlation Feature Selection (CFS) based evaluator with the Best-first searching method, an Information Gain (IG) based attribute evaluator with the Ranker searching method, and a Chi-square evaluator with the Ranker searching method. The second level enhances the structural-based alert correlation model to improve the quality of alerts and the detection capability by grouping alerts with common attributes using unsupervised learning techniques. This work compares four unsupervised learning algorithms, namely Self-Organizing Maps (SOM), K-means, Expectation-Maximization (EM), and Fuzzy C-means (FCM), to select the best clustering algorithm based on Clustering Accuracy Rate (CAR), Clustering Error (CE), and processing time.
Then an anomaly classification module is designed in the third level based on the fusion of five heterogeneous classifiers: Support Vector Machine (SVM), Instance-Based Learner (IBL), Random Forest, J48, and Bayes Net, combined using Voting as a multi-classifier. The network intrusion detection model, based on hybridizing machine learning techniques (feature selection, enhanced structural correlation, and enhanced causal correlation), is implemented on the WEKA platform. The research is carried out through a series of experiments and tests. A controlled experiment is preferred as the main method because of characteristics such as performance measures, dataset evaluation, and the usability of the results. The model is evaluated on the NSL-KDD and UNSW-NB15 datasets using five measures: detection accuracy, False Positive Rate (FPR), Precision, Total Accuracy (TA), and F-Measure (FM). The results of the proposed model are compared with recent alert correlation models. The overall detection rate is 99.9%, with a false error rate of 0.1% and an execution time of 1340.7 seconds. This shows that the proposed model, HAC, is effective and practical in providing complete correlation even on the high-dimensional, large-scale, and low-quality datasets used in intrusion detection systems.
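As an illustration of the third level and of the evaluation measures, the following is a minimal Python sketch and not the WEKA implementation used in this work. It assumes majority voting as the combination rule (the abstract does not state which rule the Voting multi-classifier uses), assumes a binary attack/normal labelling, and reads "detection accuracy" as the detection rate on the attack class. The function names and the toy predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-classifier label lists into one label per instance.

    `predictions` is a list of equally long label lists, one per classifier.
    Majority voting is an assumed combination rule for this sketch.
    """
    fused = []
    for labels in zip(*predictions):
        # With five voters and two classes a tie cannot occur.
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


def evaluate(y_true, y_pred, positive="attack"):
    """Compute the five measures from binary confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    # Guard every ratio against an empty denominator.
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    total_accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    denom = precision + detection_rate
    f_measure = 2 * precision * detection_rate / denom if denom else 0.0
    return {
        "detection_rate": detection_rate,
        "fpr": fpr,
        "precision": precision,
        "total_accuracy": total_accuracy,
        "f_measure": f_measure,
    }


# Toy predictions standing in for the five base classifiers (made-up data).
y_true = ["attack", "attack", "normal", "normal", "attack", "normal"]
base_predictions = [
    ["attack", "attack", "normal", "attack", "attack", "normal"],  # SVM
    ["attack", "normal", "normal", "normal", "attack", "normal"],  # IBL
    ["attack", "attack", "normal", "attack", "attack", "normal"],  # Random Forest
    ["attack", "attack", "attack", "normal", "normal", "normal"],  # J48
    ["normal", "attack", "normal", "attack", "attack", "attack"],  # Bayes Net
]

fused = majority_vote(base_predictions)
metrics = evaluate(y_true, fused)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy run the fused output corrects individual errors made by single classifiers but still raises one false alarm, which is why the false positive rate and precision are reported alongside the detection rate.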