Deteksi Botnet Spam dengan Cascade Learner dan Ensemble Feature Selection (Spam Botnet Detection with a Cascade Learner and Ensemble Feature Selection)

Irsyad, Fikriansyah Ramadhan (2025) Deteksi Botnet Spam dengan Cascade Learner dan Ensemble Feature Selection. Other thesis, Institut Teknologi Sepuluh Nopember.

Abstract

A botnet is a type of malware that infects multiple devices and operates under the control of a botmaster to carry out malicious activities, including spamming. Detecting botnets, and in particular distinguishing between normal traffic, non-spam botnets, and spam botnets, remains a major challenge in cybersecurity. Previous research has focused primarily on classifying network traffic into two classes: normal and botnet. Multiclass classification, especially the separation of spam botnets from other botnet traffic, has received limited attention and few in-depth studies. This study proposes a two-stage cascade learner classification framework combined with ensemble feature selection using rank aggregation to improve detection accuracy. The ensemble feature selection method integrates multiple techniques, including SelectKBest (Chi-Squared, ANOVA F-Test, Mutual Information), Variance Threshold, Backward Elimination, Recursive Feature Elimination, and SelectFromModel (tree-based), with the resulting rankings aggregated using the Borda count method. Classification proceeds in two stages: Stage 1 separates normal traffic from botnet traffic, and Stage 2 further classifies botnet traffic as non-spam botnet or spam botnet. The model was evaluated on three widely used datasets: CTU-13, NCC, and NCC-2. Experimental results show that using Random Forest (RF) in both classification stages together with the top ten selected features yields very high performance. The model achieves an average macro precision of 99.77%, recall of 98.87%, F1-score of 99.29%, and F2-score of 99.04%, with an accuracy of 99.93%. Compared with previous studies, the proposed approach demonstrates state-of-the-art performance, particularly in spam botnet detection.
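
The two core ideas in the abstract, Borda-count rank aggregation and the two-stage cascade, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the feature names, the class labels, and the rule-based stand-ins for the two stage classifiers (the thesis reports Random Forest in both stages) are assumptions made purely for illustration.

```python
from collections import defaultdict


def borda_aggregate(rankings, top_k):
    """Combine several best-first feature rankings with a Borda count.

    Each ranking lists the same features, best first. A feature in
    position p (0-based) of a ranking of length n earns n - p points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] += n - position
    # Highest total first; ties broken alphabetically for determinism.
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return ordered[:top_k]


def cascade_predict(is_botnet, is_spam, flows):
    """Two-stage cascade: Stage 1 normal vs. botnet, Stage 2 spam vs. non-spam.

    `is_botnet` and `is_spam` are hypothetical callables standing in for
    trained classifiers; Stage 2 only sees flows that Stage 1 flags.
    """
    labels = []
    for flow in flows:
        if not is_botnet(flow):
            labels.append("normal")
        elif is_spam(flow):
            labels.append("botnet_spam")
        else:
            labels.append("botnet_non_spam")
    return labels


# Hypothetical rankings produced by three different selectors.
rankings = [
    ["dur", "tot_bytes", "src_bytes", "tot_pkts"],
    ["tot_bytes", "dur", "tot_pkts", "src_bytes"],
    ["tot_bytes", "src_bytes", "dur", "tot_pkts"],
]
top_features = borda_aggregate(rankings, top_k=2)

# Toy rules in place of trained models (assumptions, not thesis results).
flows = [
    {"tot_pkts": 10, "dport": 80},
    {"tot_pkts": 500, "dport": 25},
    {"tot_pkts": 500, "dport": 6667},
]
predictions = cascade_predict(
    lambda f: f["tot_pkts"] > 100,
    lambda f: f["dport"] == 25,
    flows,
)
```

In a realistic setting each ranking would come from one of the selectors named in the abstract, and the two callables would wrap fitted classifiers. The cascade structure means Stage 2 is trained and applied only on traffic already labeled as botnet.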

Item Type: Thesis (Other)
Uncontrolled Keywords: Botnet, Multiclass, Cascade, Ensemble, Classification
Subjects: T Technology > T Technology (General) > T11 Technical writing. Scientific Writing
T Technology > T Technology (General) > T57.5 Data Processing
T Technology > T Technology (General) > T57.62 Simulation
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Irsyad Fikriansyah Ramadhan
Date Deposited: 10 Jul 2025 01:09
Last Modified: 10 Jul 2025 01:09
URI: http://repository.its.ac.id/id/eprint/119453
