Perbandingan Prediksi Loan Default Pada Kondisi Data Yang Tidak Seimbang Dengan Stacking Ensemble Learning [Comparison of Loan Default Prediction on Imbalanced Data Using Stacking Ensemble Learning]

Az, Achmad Fachreza (2023) Perbandingan Prediksi Loan Default Pada Kondisi Data Yang Tidak Seimbang Dengan Stacking Ensemble Learning. Other thesis, Institut Teknologi Sepuluh Nopember.

06111940000064-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2025.

Abstract

Peer-to-peer (P2P) lending is popular because it makes borrowing money easy, but it carries the risk that a borrower will be unable to repay the loan. It is therefore necessary to predict whether a borrower will repay. This study compares the machine learning model built in this work with several other models in predicting loan default, that is, loans that are not repaid. To improve accuracy, a stacking ensemble learning algorithm combines three machine learning models: eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Light Gradient Boosting Machine (LightGBM). Because the data contain more repaid loans than loan defaults, the data are imbalanced. Training results show that the stacking model underfits and needs some adjustment to improve its performance. The stacking model achieves better recall and F1 score than the XGBoost, CatBoost, and LightGBM models, but its precision does not exceed that of the XGBoost and CatBoost models.
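
The abstract compares models by precision, recall, and F1 score on imbalanced data. The following is a minimal sketch of why those metrics are informative in that setting. The labels and predictions are hypothetical toy values, not data or results from the thesis, and the helper names `precision_recall_f1` and `accuracy` are chosen here for illustration only.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Define each ratio as 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy data: 1 = loan default, 0 = repaid loan.
# 90 repaid loans followed by 10 defaults (an imbalanced set).
y_true = [0] * 90 + [1] * 10

# Hypothetical model A: always predicts "repaid".
pred_a = [0] * 100

# Hypothetical model B: flags 4 repaid loans by mistake,
# and catches 6 of the 10 defaults.
pred_b = [0] * 86 + [1] * 4 + [1] * 6 + [0] * 4

for name, pred in [("A", pred_a), ("B", pred_b)]:
    p, r, f = precision_recall_f1(y_true, pred)
    print(f"model {name}: accuracy={accuracy(y_true, pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy case, model A reaches high accuracy without detecting a single default, which only the recall and F1 values reveal. This is the general reason class-specific metrics are preferred over plain accuracy when defaults are rare.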

Item Type: Thesis (Other)
Uncontrolled Keywords: Loan Default, Prediction, Imbalanced Data, Stacking
Subjects: Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines.
Divisions: Faculty of Science and Data Analytics (SCIENTICS) > Mathematics > 44201-(S1) Undergraduate Thesis
Depositing User: Achmad Fachreza Az
Date Deposited: 08 Sep 2023 00:46
Last Modified: 08 Sep 2023 00:46
URI: http://repository.its.ac.id/id/eprint/103917
