Predicting Lapsed Customers for Data Package Products in the Telecommunications Industry Using Machine Learning Methods

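Saputra, Mochamad Gilang (2025) Prediksi Pelanggan Lapser pada Produk Paket Data di Industri Telekomunikasi Menggunakan Metode Machine Learning [Predicting Lapsed Customers for Data Package Products in the Telecommunications Industry Using Machine Learning Methods]. Master's thesis, Institut Teknologi Sepuluh Nopember.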

Full text: 6032231011-Master_Thesis.pdf (12MB). Restricted to Repository staff only.

Abstract

Rapid advances in information and communication technology have transformed the telecommunications industry, most visibly through growth in the number of internet users and in the volume of data they generate. Big data gives telecommunications companies an opportunity to use customer data to improve service quality and retention strategies. One of the main challenges they face is the growing number of lapsers: customers who stop subscribing to data package products or switch to another provider. Lapsing directly affects company revenue and competitive position in the market, so telecommunications companies need predictive algorithms that can accurately forecast it. This research evaluates methods for predicting lapsers among data package customers in the telecommunications industry, combining data balancing methods with several machine learning algorithms: Random Forest, Gradient Boosting, Logistic Regression, Decision Tree, Neural Network, and CatBoost. The study uses demographic data, data usage patterns, and customer transaction histories. Big data analytics on the PySpark platform is used to manage the large and complex dataset and to support better prediction accuracy. The study also analyzes the key elements that influence lapser predictions, which can help companies design effective prevention strategies. It identifies the main factors affecting lapsing for data package products in the telecommunications industry: demographic factors (region, gender, age), usage behavior (card age, quota consumption), and economic factors (income, data package purchases).
The Neural Network model, trained with undersampling for data balancing and tuned with GridSearch, performed best, reaching an Accuracy of 75.93%, a Recall of 76.28%, and an AUC of 83.14% in training and evaluation. In the real test, the model detected 84.51% of lapsed customers (Recall). These results indicate that the optimized model can serve as an effective predictive tool for customer retention strategies.
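
The abstract names undersampling as the data balancing technique and reports Accuracy, Recall, and AUC. As a rough illustration only, the sketch below shows what those terms compute on a small made-up example. It is not the thesis's code: the thesis reports using PySpark, whereas this sketch uses only the Python standard library, and the function names (`undersample`, `accuracy`, `recall`, `auc`), the 0.5 decision threshold, and all data values are assumptions introduced here. Hyperparameter tuning with GridSearch is not covered.

```python
import random
from collections import Counter


def undersample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps only as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [l for _, l in balanced]


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of true positives (here: lapsers) that the model flagged."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0


def auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data: 1 = lapser, 0 = active customer.
labels = [1] * 3 + [0] * 9
rows = [[i] for i in range(12)]
bal_rows, bal_labels = undersample(rows, labels)
class_counts = Counter(bal_labels)  # both classes now have 3 rows

# Made-up model scores for six held-out customers.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
area = auc(y_true, scores)
print(class_counts, acc, rec, area)
```

Recall is the metric that corresponds to the abstract's "detected 84.51% of lapsed customers": it measures only how many actual lapsers were caught, regardless of how many active customers were flagged by mistake.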

Item Type: Thesis (Masters)
Uncontrolled Keywords: Lapser, Lapse, Big Data Analytics, Machine Learning, Prediction, Telecommunication Industry
Subjects: Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines.
Q Science > QA Mathematics > QA278.2 Regression Analysis. Logistic regression
Q Science > QA Mathematics > QA76.87 Neural networks (Computer Science)
Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science)
Q Science > QA Mathematics > QA76.9.D37 Data warehousing.
Divisions: Interdisciplinary School of Management and Technology (SIMT) > 61101-Master of Technology Management (MMT)
Depositing User: Mochamad Gilang Saputra
Date Deposited: 02 Jun 2025 08:20
Last Modified: 02 Jun 2025 08:20
URI: http://repository.its.ac.id/id/eprint/119116
