Proposing a Model for Detecting Intrusion Network Attacks Using Machine Learning Techniques

At the present time, reliance on computers is increasing in all aspects of life, so it is necessary to protect computer networks and computing resources from complex attacks. This is done by building tools, applications, and systems that detect attacks or anomalies and adapt to ever-changing architectures and dynamically changing threats. The goal of this paper is to build a Network Intrusion Detection System (NIDS) based on deep learning techniques such as the Convolutional Neural Network (CNN), which has demonstrated its efficiency in predicting, classifying


Introduction
Network security is one of the most important concerns in network communications. As networks grow and more devices are added to them, the requirements for securing them increase. A network security system is necessary to protect the devices and data belonging to network users; it also helps protect information shared on the network, protects people's personal information, and helps prevent users from falling victim to hackers [1]. Network security technologies must therefore be developed constantly. Network intrusion detection systems play an important role in network security by detecting intrusions and alerting the competent authority. There are two types of intrusion detection systems: anomaly detection and misuse detection [2]. In anomaly detection, a database of normal activity is built, and any deviation from normal activity raises an alert that an intrusion or attack may be occurring in the network. Misuse detection stores known types of attacks in a database; if matching patterns appear in the network, they are classified as attacks [3]. Several AI methods, such as rule-based and data mining approaches, have been developed for intrusion detection systems (IDSs), and these were the first frameworks suggested for IDSs. Data mining is a way of gathering information from a large-scale database; it aids in extracting patterns from the knowledge base and uses this knowledge to forecast the occurrence of intrusions [4]. A drawback of these systems is that they cannot detect new attacks, so many machine learning (ML) algorithms have been developed to improve anomaly-based IDSs (AIDS), giving ML-based AIDS (MLAIDS) [5]. These algorithms assess the network's health by processing data and categorizing it as normal or abnormal.

The second section of this paper reviews previous work in the same field. The third section explains the data used. The fourth section presents the methodology. The fifth section presents the results and discussion, and the last section gives the conclusion. The main objective of the paper is to design a software tool or system that detects intrusions or attacks across the network using machine learning techniques. We also aim to improve the system over time through automatic learning and by using the latest deep learning algorithms to discover new attacks.

Previous works
Because computer networks are widely used in many areas of life, network security, including intrusion detection systems, is of interest to many researchers. The 1998 dataset and the 1999 KDD Cup dataset were used in 42% and 20% of these studies, respectively [6]. Since 1999, with the emergence of different algorithms and techniques in the field of intrusion detection, many researchers have used the KDD Cup dataset, which is the most widely used dataset in studies of network-based intrusion detection systems [7]. In 2019, Y. Xiao et al. [8] used batch normalization with the KDD99 dataset to construct a CNN-based IDS, using an auto-encoder (AE) network as a dimensionality reduction technique. The suggested framework also removed redundant features in order to monitor network-level and host-level activities. The authors of [9] present two approaches for detecting DDoS attacks on SDN. In the first phase, a signature-based anomaly detection system was applied to the network data; SVM and Deep Neural Network (DNN) approaches were then used to categorize the attack. The authors used the KDDCUP99 dataset for training and detection. The experimental results showed that the DNN outperformed the SVM, with accuracy rates of 92.30% and 74.30%, respectively. Vinayakumar et al. [10] developed a hybrid IDS; after an exhaustive comparison of several machine learning and deep learning classifiers, the DNN outperformed the standard machine learning classifiers. Yin and colleagues [11] used a recurrent neural network (RNN) deep learning algorithm whose input layer has 122 dimensions; when tested on the NSL-KDD dataset, their model achieved an accuracy of 83.28% in binary classification and 81.29% in multiclass classification. Cui Zihua et al. converted malware code into images for classification using a CNN, evaluated with multiple malware image sizes (24*24, 48*48, 96*96, 192*192). Riyaz et al. [12] used the KDD-99 dataset to create a CNN-based IDS for use in wireless networks. A coefficient-based feature selection technique (CRF-LCFS) was used in the framework, which improved the model's detection accuracy and computation time. According to the study, the proposed method had a detection accuracy of 98.9% and a false alarm rate of less than 1%.

The dataset
In this paper, we used the standard KDD Cup "99" dataset. The KDD'99 dataset was generated in 1999 from the network traffic of the 1998 DARPA dataset. Each network connection is described by 41 pre-processed features, which fall into four groups: core features (#1 to #9), content features (#10 to #22), time-based traffic features (#23 to #31), and host-based traffic features (#32 to #41). KDD'99 [13] has a total of 4,898,430 records. The traffic was created by spoofing different IP addresses, and the traffic characteristics were then recorded in TCP dump format. A total of seven weeks was allotted for the simulation, in which normal connections to the simulated IP addresses were established in a military network. The dataset contains various attacks, each class containing 21 kinds of properties [14], which fall under four types: "DOS attacks, Probe attacks, R2L attacks and U2R attacks" [15][16]. The simulated attacks were divided into four categories [17]:
1. Denial of Service Attack (DoS): an attack in which the attacker makes a computing or memory resource too busy or too full to process valid requests, or denies legitimate users access to a system.
2. User to Root Attack (U2R): a type of exploit in which the attacker gains access to a regular user account on the system (perhaps through password sniffing, a dictionary attack, or social engineering) and then exploits a vulnerability to get root access.
3. Remote to Local Attack (R2L): an attack in which an attacker who can send packets to a machine across a network, but does not have an account on that machine, exploits a vulnerability to gain local access as a user of that system [18].
4. Probing Attack: an attempt to obtain information about a network of computers in order to circumvent the network's security mechanisms [17].

Convolutional Neural Networks (CNN)
The way data is represented is crucial to the success of machine learning algorithms [19]. Many researchers, such as Niaz et al., have employed deep neural networks to produce effective intrusion detection systems [18]; however, these studies build their models to learn representations of manually chosen traffic characteristics rather than fully utilizing the capabilities of deep neural networks [20]. When features must be learned directly from the raw data, as in computer vision and natural language processing [21], the CNN is the most often used deep neural network architecture. A CNN takes the original data as direct input and processes it without separate feature extraction or picture reconstruction, and it has a small number of parameters. CNNs have been shown to be extremely successful in image recognition and feature extraction [22]. A neural network has three fundamental components: the input layer, the hidden layers, and the output layer. A CNN's hidden layers consist of convolutional layers with non-linear activation, pooling (down-sampling) layers, and fully connected layers. With weight sharing, the weights of connections between a subset of neurons in the same layer are shared, and for down-sampling a pooling layer is regularly inserted between successive convolutional layers. As a result, CNNs have fewer weights than other neural networks, making them more efficient at detecting network intrusions. In network intrusion detection, Vinayakumar et al. [23] demonstrate that the CNN and its variant architectures outperform standard machine learning classifiers. According to their findings, one-dimensional convolution in a CNN identifies network intrusions with high accuracy. CNNs were first used in image processing applications as a biologically inspired deep learning model for image classification, face recognition, and pattern recognition [24][25].

Convolution operations use a "filter" or "kernel" to automatically extract complex pattern features from a picture. A CNN architecture is made up of three types of layers: convolutional layers, pooling layers, and fully connected layers (see Figure 1). A CNN architecture is formed when these layers are stacked. In addition to these three layer types, the dropout layer and the activation function [26] are essential components.
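To make the layer types described above concrete, the following is a minimal NumPy sketch of a one-dimensional convolution followed by ReLU activation and max pooling. The input vector and the kernel values are illustrative assumptions, not values from the paper; in a trained CNN the kernel weights are learned.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide the kernel over x with stride 1 and no padding.

    Like CNN layers in deep learning libraries, this computes a
    cross-correlation (the kernel is not flipped).
    """
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0.0)

def max_pool1d(x, size=2):
    """Keep the maximum of each non-overlapping window of `size` elements."""
    n = (len(x) // size) * size          # drop a trailing partial window
    return x[:n].reshape(-1, size).max(axis=1)

# Illustrative input (6 feature values) and a hand-picked kernel of size 3.
x = np.array([1.0, 2.0, -1.0, 0.0, 3.0, 1.0])
kernel = np.array([1.0, 0.0, -1.0])

feature_map = conv1d_valid(x, kernel)    # length 6 - 3 + 1 = 4
activated = relu(feature_map)
pooled = max_pool1d(activated, size=2)   # length 2
```

The same kernel is reused at every position of the input, which is the weight sharing that keeps the number of parameters small.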

Methodology

4.1. Data pre-processing
The first part of our model prepares the data: it divides the connections into "attack" and "normal" categories based on the "labels" column. The "attack" category is then broken down into four main categories: DoS, Probe, R2L, and U2R. These classes are then encoded; we put all types of attacks into a single value, "attack", to make the problem a binary classification. In the next pre-processing step, each connection record has 41 features: 38 numeric features and 3 categorical features. The categorical features are encoded. Fields 0 to 2 contain information such as the duration, protocol type, and service. Fields 3-40 include information on the connection's other characteristics, such as the number of unsuccessful logins, the number of files accessed, and the number of exit limits. The connection type is given in the label column, which takes the values "normal" and "attack". Figure 2 shows these characteristics, and Figure 3 illustrates the distribution of attacks in the dataset.
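The label handling described above can be sketched as follows. This is an illustrative sketch, not the authors' code: the attack names are the labels found in the public KDD Cup "99" training data, and the names `ATTACK_CATEGORY`, `to_category`, and `to_binary` are hypothetical helpers introduced here.

```python
# Assumed mapping from KDD Cup "99" training labels to the four attack categories.
ATTACK_CATEGORY = {
    "back": "DoS", "land": "DoS", "neptune": "DoS",
    "pod": "DoS", "smurf": "DoS", "teardrop": "DoS",
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
    "buffer_overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
}

def _normalize(label: str) -> str:
    # Raw KDD'99 labels end with a dot (for example "smurf.").
    return label.strip().rstrip(".").lower()

def to_category(label: str) -> str:
    """Map a raw label to normal, DoS, Probe, R2L, U2R, or unknown."""
    name = _normalize(label)
    if name == "normal":
        return "normal"
    return ATTACK_CATEGORY.get(name, "unknown")

def to_binary(label: str) -> str:
    """Collapse every non-normal label into the single value "attack"."""
    return "normal" if _normalize(label) == "normal" else "attack"

labels = ["normal.", "smurf.", "nmap.", "rootkit.", "guess_passwd."]
categories = [to_category(l) for l in labels]
binary = [to_binary(l) for l in labels]
```

Labels that do not appear in the mapping are reported as "unknown" by `to_category` but are still treated as "attack" in the binary view, which is one possible design choice for unseen attack types.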

Figure 2. Characteristics of the data extracted from KDD Cup "99"

Figure 3. Distribution of attacks in KDD Cup "99"

We employed a pre-processing procedure for this data in which we eliminated certain categorical fields and encoded others into numeric-only fields. These fields are fed into the CNN deep learning algorithm as inputs. This step includes several tasks:
1. Ensure that the data does not include any incorrect characters.
2. Remove any fields that have empty values or values that do not contain numbers.
3. Remove duplicate columns.
The main reason for pre-processing is that the data comes in various forms and was gathered from various sources. Pre-processing also helps ensure the validity and effectiveness of the model trained on this data.
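The three cleaning tasks listed above could be implemented with Pandas roughly as follows. This is a sketch under assumptions: `clean_records` is a hypothetical helper, the toy column names are illustrative, and "fields with empty values" is interpreted here as records (rows) containing empty or non-numeric feature values.

```python
import pandas as pd

def clean_records(df: pd.DataFrame, feature_cols) -> pd.DataFrame:
    """Apply the three cleaning tasks to the feature columns of df."""
    feature_cols = list(feature_cols)
    out = df.copy()
    # Task 1: coerce features to numbers; invalid characters become NaN.
    out[feature_cols] = out[feature_cols].apply(pd.to_numeric, errors="coerce")
    # Task 2: drop records with empty or non-numeric feature values.
    out = out.dropna(subset=feature_cols)
    # Task 3: drop feature columns that duplicate an earlier column.
    dup = out[feature_cols].T.duplicated()
    out = out.drop(columns=dup[dup].index)
    return out.reset_index(drop=True)

# Toy data: one invalid value, one empty value, one duplicated column.
raw = pd.DataFrame({
    "duration": ["0", "5", "x?", "2"],
    "src_bytes": [181, 239, 100, None],
    "src_bytes_copy": [181, 239, 100, None],
    "label": ["normal", "attack", "attack", "normal"],
})
cleaned = clean_records(raw, ["duration", "src_bytes", "src_bytes_copy"])
```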

Feature selection
The first feature selection method eliminates redundant and irrelevant data by selecting a subset of relevant features that represents the specific problem. The second method takes advantage of the correlation between fields to obtain more efficient classification: we identify sets of highly correlated variables and keep only one variable from each set. We then scale the continuous values of all features so that the CNN trains on data in the same range; all numerical features of the dataset are normalized using z-scores.
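A minimal sketch of correlation-based feature removal followed by z-score normalization is given below. The correlation threshold of 0.95 and the helper names `drop_correlated` and `zscore` are assumptions made for illustration; the paper does not state the threshold it used.

```python
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """From each group of highly correlated columns, keep only the first."""
    corr = df.corr().abs()
    # Keep the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return df.drop(columns=to_drop)

def zscore(df: pd.DataFrame) -> pd.DataFrame:
    """Scale each column to zero mean and unit standard deviation."""
    std = df.std(ddof=0).replace(0, 1.0)   # avoid dividing by zero
    return (df - df.mean()) / std

# Toy features: b is an exact multiple of a, c is only weakly related.
features = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [1.0, -1.0, 1.0, -1.0],
})
reduced = drop_correlated(features)
scaled = zscore(reduced)
```

In practice the mean and standard deviation would be computed on the training split only and then reused for the test split, so that no information from the test data leaks into training.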

Building the Model
We trained a CNN model on the training data (80% of the data) using the Keras, Pandas, NumPy, and Scikit-learn libraries. We built a sequential CNN model with 3 layers. The convolutional layer has 64 filters with a kernel size of 3 and uses the ReLU activation function, which is applied to each numeric value and replaces all negative values with zero. A max pooling layer follows the convolutional layer. Although a max pooling layer is not required for a CNN model, we use it because there is very little chance of losing key features through max pooling, as the converted data contains only numerical information. A flatten layer is followed by a dense layer, and dropout with a rate of 0.25 is applied. Dropout is placed near the end of the network; at every training step a random 75% of the nodes are kept, which increases the randomness of the model and reduces overfitting. Finally, a fully connected (dense) layer produces the output. Figure 4 shows the CNN architecture, which consists of an input layer, one convolutional layer, one max pooling layer that outputs the maximum of two adjacent elements, a fully connected layer followed by a dropout layer, and a fully connected output layer.
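The architecture described above might be expressed in Keras along the following lines. This is a sketch, not the authors' exact model: the input length of 41 features, the 64 units in the hidden dense layer, and the binary cross-entropy loss are assumptions, since the text does not specify them; `build_cnn` is a hypothetical helper name.

```python
import numpy as np
from tensorflow.keras import layers, models

def build_cnn(n_features: int, dense_units: int = 64):
    """Sequential 1D CNN: Conv1D -> MaxPooling1D -> Flatten -> Dense -> Dropout -> Dense."""
    model = models.Sequential([
        layers.Input(shape=(n_features, 1)),        # one channel per feature value
        layers.Conv1D(filters=64, kernel_size=3, activation="relu"),
        layers.MaxPooling1D(pool_size=2),           # maximum of two adjacent elements
        layers.Flatten(),
        layers.Dense(dense_units, activation="relu"),   # assumed size
        layers.Dropout(0.25),                       # drops 25% of nodes during training
        layers.Dense(1, activation="sigmoid"),      # probability of "attack"
    ])
    # The loss function is an assumption; it is the usual choice for a sigmoid output.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_cnn(n_features=41)

# Forward pass on two dummy records to check the output shape.
dummy = np.zeros((2, 41, 1), dtype="float32")
probs = model(dummy).numpy()
```

Training would then be a call such as `model.fit(X_train, y_train, batch_size=128, epochs=30)`, where `X_train` has shape `(n_samples, n_features, 1)` and `X_train` and `y_train` are placeholder names for the 80% training split.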

Results and discussion
For our model, the Adam optimizer was used with a batch size of 128 and 30 epochs, and the activation function of the output layer was sigmoid. The model summary is given in Figure 5. The trained model was evaluated on the test dataset, and the accuracy of the classifier was analysed; the results and performance evaluation are shown in the following figures. The aim was to attain improved detection rates, classify types of attacks, and determine whether each connection is normal or abnormal. The model reached an accuracy rate of 99.7% in distinguishing between attack and normal connections.

Figure 6. The result of accuracy and loss

To verify the results obtained with the proposed model, we compared them with the results of previous research. The comparison shows that the results of our model were higher than those of previous research, as shown in the following table.
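For reference, accuracy, detection rate, and false alarm rate can be computed from the sigmoid outputs as in the sketch below. The decision threshold of 0.5, the helper name `binary_metrics`, and the toy values are assumptions for illustration and are not results from the paper.

```python
import numpy as np

def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute accuracy, detection rate, and false alarm rate (1 = attack)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_prob) >= threshold
    tp = int(np.sum(y_pred & y_true))      # attacks detected
    tn = int(np.sum(~y_pred & ~y_true))    # normal traffic accepted
    fp = int(np.sum(y_pred & ~y_true))     # normal traffic flagged
    fn = int(np.sum(~y_pred & y_true))     # attacks missed
    return {
        "accuracy": (tp + tn) / max(tp + tn + fp + fn, 1),
        "detection_rate": tp / max(tp + fn, 1),
        "false_alarm_rate": fp / max(fp + tn, 1),
    }

# Toy labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.3, 0.1, 0.6, 0.2, 0.05, 0.7]
metrics = binary_metrics(y_true, y_prob)
```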

Conclusions
The goal of the project was to build a CNN-based IDS that can classify the connections in the dataset as "attack" or "normal". A binary classification model was built in which we used fully connected and convolutional neural network layers to classify the types of connections and attacks. The system was trained and tested using the KDD dataset; classifier accuracy, detection rates, and error rates were measured using the Python language and Jupyter Notebook. The results of practical experiments showed better performance when using the reduced-feature dataset than when using the full-feature dataset. We reached an accuracy rate of 98.10% in distinguishing between attack and normal connections.

Acknowledgments
This work was supported by the University of Mosul / College of Computer Sciences and Mathematics. This work is part of the requirements for obtaining a master's degree.