Implementation of MOOC Student Dropout Prediction Using the Stacking Super Learner Method in a High-Performance Computing Environment

Rahmi, Mery Yulinda (2023) Implementasi Prediksi Siswa Dropout pada MOOC Menggunakan Metode Stacking Super Learner dalam Lingkungan Komputasi Berkinerja Tinggi. Other thesis, Institut Teknologi Sepuluh Nopember.

05211940000003-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2025.

Abstract

The main problem on many MOOC (massive open online course) platforms is a high dropout rate, which can reach 91%–93%. This directly affects the profitability of MOOC businesses. A model that predicts student dropout is therefore needed to enable preventive interventions. However, the large volume of MOOC student data makes such modeling computationally expensive. To address this, this final project builds a prediction model using a state-of-the-art stacking method, Super Learner, and computes it in parallel on GPU or CPU in a high-performance computing environment. The base learners that compose the Super Learner are Logistic Regression, KNN, SVM, Naïve Bayes, Random Forest, and XGBoost; the meta-learners evaluated are NNloglik (non-negative binomial likelihood maximization) and AUC-maxim (AUC maximization). The experiments show that Super Learner with either the AUC-maxim or the NNloglik meta-learner outperforms both the individual base learners and a model built with another stacking method, Stacked Generalization. The two models achieve F1 scores of 0.90147 and 0.90126, respectively. In addition, GPU parallelization in these experiments yielded a computational speedup 2.4 to 23.3 times higher than that of CPU parallelization.
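The meta-learning step mentioned in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: it assumes the base learners' out-of-fold dropout probabilities are already available, uses made-up toy numbers, and replaces a numerical optimizer with a coarse grid search over non-negative weights that sum to 1, choosing the weights that minimize the binomial negative log-likelihood. All function names (`log_loss`, `blend`, `fit_simplex_weights`) are hypothetical.

```python
import math
from itertools import product


def log_loss(y_true, probs, eps=1e-12):
    """Mean binomial negative log-likelihood of predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def blend(weights, oof_preds):
    """Weighted average of the base learners' probabilities, per sample."""
    n = len(oof_preds[0])
    return [sum(w * preds[i] for w, preds in zip(weights, oof_preds))
            for i in range(n)]


def fit_simplex_weights(y_true, oof_preds, steps=20):
    """Grid-search non-negative weights summing to 1 that minimize log loss."""
    k = len(oof_preds)
    best_w, best_loss = None, float("inf")
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) > steps:
            continue  # the last weight would be negative
        w = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        loss = log_loss(y_true, blend(w, oof_preds))
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Toy data (assumed, not from the thesis): 1 = dropout, 0 = retained.
y = [1, 1, 0, 1, 0, 1, 0, 1]
oof = [
    [0.9, 0.8, 0.3, 0.7, 0.2, 0.6, 0.4, 0.9],  # hypothetical learner A
    [0.6, 0.9, 0.1, 0.5, 0.4, 0.8, 0.2, 0.7],  # hypothetical learner B
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # uninformative learner
]

weights, loss = fit_simplex_weights(y, oof)
print("weights:", weights)
print("ensemble log loss:", round(loss, 4))
```

Because the grid includes the cases where a single learner receives all the weight, the blended log loss on these predictions can never be worse than that of the best individual base learner, which is the intuition behind combining learners this way.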

Item Type: Thesis (Other)
Uncontrolled Keywords: Komputasi Berkinerja Tinggi, High Performance Computing, MOOC Dropout, Stacking, Super Learner
Subjects: Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines.
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Information System > 57201-(S1) Undergraduate Thesis
Depositing User: Mery Yulinda Rahmi
Date Deposited: 15 Aug 2023 04:42
Last Modified: 15 Aug 2023 04:42
URI: http://repository.its.ac.id/id/eprint/102459