Customer Churn Analysis on Banking Data Using a Stacking-Based Ensemble Technique (Analisis Customer Churn pada Data Perbankan Menggunakan Teknik Ensambel Berbasis Stacking)

Rahmawati, Ariftra (2025) Analisis Customer Churn pada Data Perbankan Menggunakan Teknik Ensambel Berbasis Stacking. Other thesis, Institut Teknologi Sepuluh Nopember.

5026211087-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only

Abstract

Intensifying competition in the banking industry, together with rising customer needs and expectations of banking services, has made customer churn a major problem for banks. Customer churn is a customer's decision to leave a bank's services and switch to another provider. It can cause a direct loss of revenue and threaten the bank's long-term financial stability. A churn prediction model therefore helps banks identify at-risk customers and design more effective retention strategies. This thesis develops a churn prediction model using a stacking-based ensemble technique on banking data. Stacking was chosen because it can improve predictive accuracy by combining several machine learning algorithms with different characteristics and strengths. The base learners are Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR); the meta learners evaluated are Logistic Regression and Random Forest. The stacking approach is used to find the model combination that best recognizes the behavioral patterns of customers at risk of churn. Experimental results show that, among the individual models, RF with hyperparameter tuning performed best on the test data, with an accuracy of 81.00%. The stacking model with an RF meta learner (Stacking RF) achieved the highest accuracy, 82.75%, while the stacking model with an LR meta learner (Stacking LR) achieved the highest F1-score, 0.6024. Feature importance analysis of Stacking RF identified customer age, number of products held, and customer activity status as the main factors influencing the likelihood of churn.
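The stacking setup described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the thesis's actual code. The record does not name the dataset, the preprocessing, or the tuned hyperparameters, so the synthetic data from `make_classification`, the roughly 20% churn rate, the feature scaling, and every hyperparameter value are assumptions made only so the example runs. Scores from this sketch are not comparable to the results reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the banking data (assumption: about 20% churners).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)


def make_base_learners():
    """Return the three base learners; hyperparameters are illustrative only."""
    return [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
        # SVM and LR are scaled inside a pipeline (an assumed preprocessing step).
        ("svm", make_pipeline(StandardScaler(), SVC())),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ]


# The two meta learners named in the abstract.
meta_learners = {
    "Stacking LR": LogisticRegression(max_iter=1000),
    "Stacking RF": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, meta in meta_learners.items():
    # Out-of-fold predictions of the base learners (5-fold CV) train the meta learner.
    model = StackingClassifier(
        estimators=make_base_learners(), final_estimator=meta, cv=5
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.4f}, F1={scores['f1']:.4f}")
```

Reporting both accuracy and F1-score mirrors the abstract, where the two metrics favor different meta learners. On imbalanced churn data, a model can gain accuracy by predicting the majority (non-churn) class more often while its F1-score on the churn class drops.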

Item Type: Thesis (Other)
Uncontrolled Keywords: Customer Churn, Churn Prediction, Stacking Ensemble, Banking Data, Machine Learning
Subjects: Q Science > QA Mathematics > QA336 Artificial Intelligence
T Technology > T Technology (General) > T57.5 Data Processing
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Information System > 57201-(S1) Undergraduate Thesis
Depositing User: Ariftra Rahmawati
Date Deposited: 29 Jul 2025 02:53
Last Modified: 29 Jul 2025 02:53
URI: http://repository.its.ac.id/id/eprint/122707
