Sistem Deteksi Malicious Events Dengan Pendekatan Continuous Retraining Dan Distribution Shift Detection

Diani, Nabila A'idah (2025) Sistem Deteksi Malicious Events Dengan Pendekatan Continuous Retraining Dan Distribution Shift Detection. Other thesis, Institut Teknologi Sepuluh Nopember.

Abstract

According to data from the AV-TEST Institute, the Windows operating system faced more than 100 million malware attacks in 2021. Malware is software designed to damage a device's system, and its behavior is treated as anomalous. Anomaly patterns change over time, which causes a distribution shift in the data. To address this threat, malicious activity is detected by training machine learning models (Autoencoders, Isolation Forest, Local Outlier Factor, and One-Class Support Vector Machine) on benign data only, then testing them on data that mixes malware and benign samples. Four scenarios are evaluated: (1) training on the control dataset and testing on both the control and treatment datasets, (2) mixing the control and treatment datasets for both training and testing, (3) using only the treatment dataset for training and testing, and (4) continuous retraining over 10 iterations, where the prediction results of one iteration are used as training data for the next. Two feature sets are also compared: 19 features and 12 selected features. In the experiments, Isolation Forest showed a high and stable F1-score, above 0.85, in the continuous retraining scenario with 12 features.
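
The continuous retraining scenario (4) can be illustrated with a minimal sketch. This is not the thesis implementation: the thesis evaluates Isolation Forest and other models, while the sketch below swaps in a simple per-feature z-score detector so that it runs with the Python standard library alone. The synthetic data, the drift rate, the threshold of 3.0, and all function names are assumptions made for illustration.

```python
import random
import statistics

ANOMALY = [50.0, 50.0]  # hypothetical malicious event, far from the benign data


def fit(benign_rows):
    """Fit per-feature mean and standard deviation on rows assumed benign."""
    cols = list(zip(*benign_rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero spread
    return means, stds


def score(model, row):
    """Anomaly score: the largest absolute z-score across features."""
    means, stds = model
    return max(abs(x - m) / s for x, m, s in zip(row, means, stds))


def predict(model, rows, threshold=3.0):
    """Return True for each row flagged as anomalous."""
    return [score(model, r) > threshold for r in rows]


def continuous_retraining(initial_benign, batches, threshold=3.0):
    """Retrain once per batch; rows predicted benign join the next training set."""
    train = list(initial_benign)
    history = []
    for batch in batches:
        model = fit(train)                       # retrain on everything accepted so far
        flags = predict(model, batch, threshold)
        history.append(flags)
        # Feed the predictions back: only rows predicted benign become training data.
        train.extend(r for r, is_anom in zip(batch, flags) if not is_anom)
    return history, train


def make_data(n_batches=10, size=50, drift=0.1, seed=0):
    """Synthetic benign data whose mean drifts slowly, plus one anomaly per batch."""
    rng = random.Random(seed)
    initial = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
    batches = []
    for i in range(1, n_batches + 1):
        center = drift * i                       # gradual distribution shift
        batch = [[rng.gauss(center, 1), rng.gauss(center, 1)] for _ in range(size)]
        batch.append(list(ANOMALY))              # injected anomaly is the last row
        batches.append(batch)
    return initial, batches


initial, batches = make_data()
history, train = continuous_retraining(initial, batches)
for i, flags in enumerate(history, start=1):
    print(f"iteration {i}: flagged {sum(flags)} of {len(flags)} rows")
```

Because the training set grows with each batch of rows predicted benign, the fitted statistics follow the slow drift in the benign data, while rows flagged as anomalous are kept out of later training. The same loop structure would apply with a different detector in place of `fit` and `predict`.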

Item Type: Thesis (Other)
Uncontrolled Keywords: Malware, Outlier detection, Distribution drift, Continuous retraining.
Subjects: T Technology > T Technology (General) > T57.5 Data Processing
T Technology > T Technology (General) > T58.5 Information technology. IT--Auditing
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Nabila A'idah Diani
Date Deposited: 22 Jul 2025 03:13
Last Modified: 22 Jul 2025 03:13
URI: http://repository.its.ac.id/id/eprint/120413
