Financial Fraud Classification Using Ensemble Oversampling and Deep Learning Hyperparameter Optimization to Handle Imbalanced Data

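Pratama, Moch Deny (2023) Klasifikasi Financial Fraud Menggunakan Ensemble Oversampling dan Optimasi Hyperparameter Deep Learning untuk Mengatasi Imbalanced Data [Financial Fraud Classification Using Ensemble Oversampling and Deep Learning Hyperparameter Optimization to Handle Imbalanced Data]. Master's thesis, Institut Teknologi Sepuluh Nopember.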

6025211038-Master Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2025.

Abstract

Financial Fraud classification tasks such as credit card fraud and bitcoin fraud suffer from imbalanced data, so the fraud class must be oversampled. Financial transactions can have different attributes. In credit card transactions, attributes may include the nominal amount, transaction period information, deposit status or other transaction types such as withdrawals or refunds, and more detailed information. In bitcoin transactions, attributes may represent the number of nodes, transaction fees, output volume, and aggregate figures. Because the characteristics of Financial Fraud data are so varied, an adaptive oversampling method is needed for the classification model to perform well. Ensemble Oversampling combines two kinds of oversampling technique, generative and traditional, to obtain the advantages of each and to create greater variation in the generated data. The generative approach can produce more varied and realistic data, while the traditional approach preserves the characteristics and structure of the original data. Ensemble Oversampling is proposed as a general-purpose approach to Financial Fraud classification for credit card and bitcoin data. The proposed method combines a generative oversampling approach (GAN) with traditional approaches (SMOTE and ADASYN). In the classification step, Deep Learning algorithms such as CNN and LSTM are applied to achieve better performance. A Genetic Algorithm is used to optimize the Deep Learning hyperparameters. The evaluation compares the following scenarios: no oversampling; oversampling with GAN, SMOTE, or ADASYN; and ensemble oversampling. Combined GAN and SMOTE oversampling with a CNN classifier yields the highest evaluation score of all scenarios, with an average F1-Score of 0.995 and Kappa Statistics of 0.990.
This shows that the quality of the added data can affect prediction performance, and that ensemble oversampling can be considered for improving classification performance on Financial Fraud data.
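
The two metrics reported in the abstract, F1-Score and Kappa Statistics, are both less forgiving than plain accuracy on imbalanced data. The following is a minimal sketch of how they can be computed from a binary confusion matrix. It assumes the kappa in question is Cohen's kappa and that fraud is encoded as label 1; the function names and the toy data are illustrative and are not taken from the thesis.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fp, fn, tn) for binary labels, with 1 = fraud (assumed encoding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement between labels and predictions, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of each class.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present in both vectors
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): 10 fraud cases among 1000 transactions,
# and a classifier that never predicts fraud.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print("accuracy:", accuracy)
print("F1:", f1_score(tp, fp, fn))
print("kappa:", cohen_kappa(tp, fp, fn, tn))
```

In this toy case the classifier is 99% accurate while never detecting a single fraud, yet its F1-Score and kappa are both zero, which is why scores close to 1 on these two metrics are a more meaningful result than high accuracy alone.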

Item Type: Thesis (Masters)
Uncontrolled Keywords: Financial Fraud Classification, Imbalanced Data, Ensemble Oversampling, Deep Learning
Subjects: T Technology > T Technology (General) > T57.5 Data Processing
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55101-(S2) Master Thesis
Depositing User: Moch Deny Pratama
Date Deposited: 04 Oct 2023 04:36
Last Modified: 04 Oct 2023 04:36
URI: http://repository.its.ac.id/id/eprint/102637
