Sistem Deteksi Intrusi Menggunakan Metode Klasifikasi Berbasis Centroid dan Sub-Centroid Tetangga Terdekat

Setiawan, Bambang (2020) Sistem Deteksi Intrusi Menggunakan Metode Klasifikasi Berbasis Centroid dan Sub-Centroid Tetangga Terdekat. Doctoral thesis, Institut Teknologi Sepuluh Nopember.


Abstract

Detecting intrusions in computer network traffic has challenged researchers for years. Advances in machine learning make it possible to detect network intrusions without a signature database. The main concern of an intrusion detection system is the balance between speed and detection quality, where quality means not only accuracy, with priority on high sensitivity, but also completeness in detecting attacks. When the amount of training data is unbalanced across attack types, a system can reach high accuracy yet fail to recognize every attack type, so completeness is not met. Many intrusion detection models have been developed with machine learning techniques, using either a single algorithm or a hybrid, but they generally still produce high false-negative and false-positive rates and cannot detect all attack types. The same holds for L-SCANN, an intrusion detection model built on centroid-based classification. Centroid-based classification is a hybrid machine learning approach designed specifically to speed up classification; it uses two components, a clustering algorithm and a classification algorithm. To improve the detection quality of L-SCANN, this study builds a new intrusion detection approach that combines several methods. At the preprocessing stage, the Log normalization method and two feature selection methods, MRIGFS and RWIGFS, are proposed to address the imbalanced-class problem. The k parameter of L-SCANN is then optimized to improve prediction accuracy. Finally, L-SCANN is combined with SVM-OP and SVM-OW through an ensemble-voting technique to validate false-negative and false-positive predictions. SVM-OP is an SVM with RBF kernel optimization, and SVM-OW is a cost-learning SVM with class weight optimization.
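The pipeline described above can be illustrated with a minimal sketch. This is not the thesis's L-SCANN, MRIGFS, RWIGFS, or SVM implementation; it only shows the general ideas of log normalization, nearest-centroid classification, and majority voting. The function names, the `log(1 + x)` form of the normalization, the toy data, and the two stand-in SVM votes are all assumptions made for illustration.

```python
import math
from collections import Counter

def log_normalize(row):
    # Assumed form: log(1 + x), which compresses heavy-tailed,
    # non-negative features such as byte or connection counts.
    return [math.log1p(x) for x in row]

def fit_centroids(rows, labels):
    # One centroid (feature-wise mean) per class.
    sums, counts = {}, Counter(labels)
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def nearest_centroid(centroids, row):
    # Predict the class whose centroid is closest in Euclidean distance.
    return min(centroids, key=lambda label: math.dist(centroids[label], row))

def majority_vote(predictions):
    # Most common label among the classifiers' predictions for one sample.
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical toy data: two features per connection record.
train = [log_normalize(r) for r in [[0, 1], [2, 3], [900, 1200], [1500, 800]]]
labels = ["normal", "normal", "attack", "attack"]
centroids = fit_centroids(train, labels)

sample = log_normalize([1000, 1000])
centroid_pred = nearest_centroid(centroids, sample)

# The other two votes are hard-coded stand-ins for the SVM-OP and SVM-OW outputs.
final = majority_vote([centroid_pred, "attack", "normal"])
```

With three voters, a single dissenting classifier cannot overturn the other two, which is one simple way an ensemble can correct an individual model's false negative or false positive. How the thesis actually combines the votes may differ.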
Model performance was tested on the NSL-KDD and Kyoto2006++ datasets. On NSL-KDD, the results show that ensemble voting with SVM-OP and SVM-OW improves the detection performance of L-SCANN: accuracy, sensitivity, and specificity all rise above 99.0%, FPR and FNR fall below 0.3%, and sensitivity for the minority classes R2L and U2R increases to 94% and 65%, respectively. On Kyoto2006++, the model achieves accuracy, sensitivity, and specificity above 99.0%, with FPR and FNR below 0.25%.
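For readers unfamiliar with the reported measures, the following sketch shows how accuracy, sensitivity, specificity, FPR, and FNR are conventionally derived from binary confusion-matrix counts, with "attack" as the positive class. The function name and the counts are hypothetical and are not taken from the thesis's experiments.

```python
def detection_metrics(tp, tn, fp, fn):
    # tp: attacks flagged as attacks, tn: normal traffic passed as normal,
    # fp: normal traffic flagged as attack, fn: attacks that were missed.
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "fpr": fp / (fp + tn),           # equals 1 - specificity
        "fnr": fn / (fn + tp),           # equals 1 - sensitivity
    }

# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=995, tn=9970, fp=30, fn=5)
```

Because FPR and FNR are the complements of specificity and sensitivity, an FNR below 0.3% implies a sensitivity above 99.7% for the same positive class. Per-class sensitivity, such as the R2L and U2R figures, is computed the same way by treating one attack class as positive and everything else as negative.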

Item Type: Thesis (Doctoral)
Additional Information: RDIf 005.8 Set s-1 2020
Uncontrolled Keywords: centroid-based classification, ensemble-voting, feature selection, intrusion detection system, Kyoto2006++ dataset, log normalization, NSL-KDD dataset, support vector machine
Subjects: Q Science > QA Mathematics > QA76.9.A25 Computer security. Digital forensic. Data encryption (Computer science)
Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science)
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55001-(S3) PhD Thesis (Comp Science)
Depositing User: Bambang Setiawan
Date Deposited: 31 Jan 2020 07:33
Last Modified: 12 Jul 2024 01:10
URI: http://repository.its.ac.id/id/eprint/74310
