Optimizing Fraud Detection Using Oversampling Methods and Panel Data Analysis in Classification and Regression

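Adiba, Jordan Istiqlal Qalbi (2025) Optimasi Deteksi Fraud Menggunakan Metode Oversampling dan Analisis Data Panel dalam Klasifikasi dan Regresi [Optimizing Fraud Detection Using Oversampling Methods and Panel Data Analysis in Classification and Regression]. Master's thesis, Institut Teknologi Sepuluh Nopember.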

Abstract

Detecting fraud in financial reports is a major challenge in maintaining economic stability and investor confidence. Class imbalance, in which fraud cases are very rare compared with normal data, is an obstacle to building accurate detection models. To address this, the study proposes an approach based on panel data analysis and generative oversampling, specifically a Wasserstein Generative Adversarial Network with Gradient Penalty (WGANGP) optimized with a threshold mechanism that controls the quality and distribution of the synthetic data. The study uses 5,236 financial report records from manufacturing companies in Indonesia, labeled with the Balanced Scorecard (BSC) approach into four classes: normal, alarm, risky, and fraud. Oversampling was performed with the statistical method SMOTE and with several generative models: the Generative Adversarial Network (GAN), Conditional GAN (CGAN), Wasserstein GAN (WGAN), Relativistic GAN (RGAN), PacGAN, and WGANGP, each with tuned hyperparameters for generating synthetic data. Oversampling was tested in three scenarios: (1) oversampling by general label, (2) oversampling per entity per label, and (3) oversampling using continuous values defined within a range. Results varied across scenarios, with WGANGP and SMOTE performing notably well on the Euclidean Distance (ED) and Fréchet Inception Distance (FID) evaluations. In the classification and regression tests, WGANGP with thresholds of 0.2 and 0.8 gave more stable and superior results: across all predictive models, evaluation scores improved by 2% to 17%, with F1-scores of 0.636472 and 0.632327, an accuracy of 0.662042, and gains on various regression metrics.
The results also show that deep learning models adapt better to non-stationary data, while statistical models perform better under class imbalance. Overall, the study demonstrates that WGANGP with thresholds can generate stable synthetic data, reduce the risk of overfitting, and improve fraud detection performance, particularly in regression on financial data. The combination of generative approaches and panel data analysis contributes significantly to the development of more efficient and accurate fraud detection systems.
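
As a rough illustration of the SMOTE baseline and the Euclidean Distance check mentioned in the abstract, the sketch below oversamples a single rare class by interpolating between same-class neighbours and then compares the feature means of the real and synthetic rows. It is not the thesis's implementation: the function names (`smote_like`, `column_means`), the toy feature values, and the choice to measure ED between mean vectors are all assumptions made for illustration, and the WGANGP threshold mechanism is not reproduced here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote_like(samples, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority-class
    row and one of its k nearest same-class neighbours (SMOTE-style)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Nearest neighbours of `base` among the other rows of the same class.
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: euclidean(base, s))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


def column_means(rows):
    """Per-feature mean of a list of rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


# Hypothetical toy data: two financial ratios per report, for one rare class.
fraud_rows = [[0.10, 1.20], [0.15, 1.10], [0.12, 1.30], [0.20, 1.00]]
synthetic_rows = smote_like(fraud_rows, n_new=6, k=2, seed=42)

# Crude fidelity check (an assumption): distance between real and synthetic means.
ed = euclidean(column_means(fraud_rows), column_means(synthetic_rows))
print(len(synthetic_rows), round(ed, 4))
```

Because every synthetic row is a convex combination of two real rows from the same class, the generated values stay within the range of the observed data. Applying the function separately to each (entity, label) group would correspond loosely to the second scenario described above.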

Item Type:	Thesis (Masters)
Uncontrolled Keywords:	Balanced Scorecard, Classification, Fraud Detection, GAN, Oversampling, Panel Data, Regression, WGANGP
Subjects:	H Social Sciences > HV Social pathology. Social and public welfare > HV6691 Fraud--Prevention; Q Science > QA Mathematics > QA336 Artificial Intelligence; Q Science > QA Mathematics > QA76.87 Neural networks (Computer Science); T Technology > T Technology (General) > T57.5 Data Processing
Divisions:	Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55101-(S2) Master Thesis
Depositing User:	Jordan Istiqlal Qalbi Adiba
Date Deposited:	01 Aug 2025 02:00
Last Modified:	01 Aug 2025 02:00
URI:	http://repository.its.ac.id/id/eprint/122733