Comparison of the Accuracy of Random Forest, Support Vector Machine, Extreme Gradient Boosting, and Multilayer Perceptron Models in Detecting Vehicle Insurance Claim Fraud

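Mujiburrahman, Ammar (2025) Perbandingan Akurasi Model Random Forest, Support Vector Machine, Extreme Gradient Boost, dan Multilayer Perceptron Dalam Mendeteksi Penipuan Klaim Asuransi Kendaraan [Comparison of the Accuracy of Random Forest, Support Vector Machine, Extreme Gradient Boosting, and Multilayer Perceptron Models in Detecting Vehicle Insurance Claim Fraud]. Other thesis, Institut Teknologi Sepuluh Nopember.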

5006211079-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only

Abstract

Insurance claim fraud is one of the biggest challenges facing the insurance industry: it causes significant financial losses for companies and erodes public trust in insurance services. Effective methods for detecting fraudulent claims are therefore needed. This study compares the performance of four machine learning algorithms, namely Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP), in detecting fraud in vehicle insurance claims. It also evaluates the effect of three sampling techniques, namely undersampling, oversampling, and the Synthetic Minority Oversampling Technique for Nominal and Continuous (SMOTE-NC), on handling class imbalance in the vehicle insurance claim dataset. The data consist of vehicle insurance claims from August 2019 to August 2024. The results show that the best model for detecting fraudulent vehicle insurance claims is XGBoost without any sampling technique, with an F1-Score of 95.72%, a recall of 93.63%, and an accuracy of 97.98%. The second-best model is Random Forest, also without sampling, with an F1-Score of 89.27%, a recall of 82.87%, and an accuracy of 95.19%. It is followed by SVM with random oversampling, which achieved an F1-Score of 65.89%, a recall of 67.33%, and an accuracy of 83.17%. MLP with random oversampling ranked last, with an F1-Score of 54.13%, a recall of 47.01%, and an accuracy of 80.77%. The study therefore concludes that XGBoost without any sampling technique is the best model for detecting fraud in vehicle insurance claims.
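The two simpler sampling techniques and the three reported metrics can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names and the toy data are hypothetical, only the Python standard library is used, and SMOTE-NC is not reproduced because it synthesizes new samples rather than copying existing ones. In practice, resampling of this kind is normally applied to the training split only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


def fraud_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, precision and F1, treating `positive` as the fraud class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Toy imbalanced data (hypothetical): 8 legitimate claims (0), 2 fraudulent (1).
rows = list(range(10))
labels = [0] * 8 + [1] * 2
over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

# Toy predictions (hypothetical) for ten claims, three of them fraudulent.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = fraud_metrics(y_true, y_pred)
```

On the toy predictions, accuracy is 0.8 while recall is only 2/3, which shows why F1-Score and recall are reported alongside accuracy when fraudulent claims are the minority class.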

Item Type:	Thesis (Other)
Uncontrolled Keywords:	MLP, Random Forest, SVM, XGBoost, Insurance Claim Fraud, Sampling Technique
Subjects:	Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines.
Divisions:	Faculty of Mathematics, Computation, and Data Science > Actuaria > 94203-(S1) Undergraduate Thesis
Depositing User:	Ammar Mujiburrahman
Date Deposited:	29 Jul 2025 06:44
Last Modified:	29 Jul 2025 06:44
URI:	http://repository.its.ac.id/id/eprint/122573
