Rahmi, Mery Yulinda (2023) Implementasi Prediksi Siswa Dropout pada MOOC Menggunakan Metode Stacking Super Learner dalam Lingkungan Komputasi Berkinerja Tinggi. Other thesis, Institut Teknologi Sepuluh Nopember.
Text: 05211940000003-Undergraduate_Thesis.pdf - Accepted Version. Restricted to Repository staff only until 1 October 2025.
Abstract
A major problem across MOOC (massive open online course) platforms is the high dropout rate, which can reach 91%–93% and directly affects the profitability of MOOC businesses. A model that predicts which students will drop out is therefore needed so that preventive interventions can be made. However, the large volume of MOOC student data makes such modeling computationally expensive. To address this, this final project builds a prediction model using a state-of-the-art stacking method, Super Learner, and runs it in parallel on GPU or CPU in a high-performance computing environment. The base learners in the Super Learner model are Logistic Regression, KNN, SVM, Naïve Bayes, Random Forest, and XGBoost, and the meta-learners tested are NNloglik (non-negative binomial likelihood maximization) and AUC-maxim (AUC maximization). The experiments show that Super Learner with either the AUC-maxim or the NNloglik meta-learner outperforms both the individual base learners and a model built with another stacking method, Stacked Generalization. The two Super Learner models achieve F1 scores of 0.90147 and 0.90126, respectively. In addition, GPU parallelization in these experiments yielded a computational speedup of 2.4 to 23.3 times over CPU parallelization.
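The stacking idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the synthetic data, the three toy base learners (`fit_knn`, `fit_logistic`, `fit_prior`), and every function name are assumptions made for illustration, and the weight search is a plain grid over non-negative weights that sum to one, chosen to minimize binomial log loss. That is only in the spirit of NNloglik and is not the optimizer used by any particular library. The sketch runs on a single CPU core and does not attempt GPU parallelization.

```python
import itertools
import math
import random


def make_data(n=200, seed=0):
    """Synthetic stand-in for MOOC data: one feature, binary dropout label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.random() < 0.5
        X.append(rng.gauss(1.0 if label else -1.0, 1.2))
        y.append(1 if label else 0)
    return X, y


def fit_knn(X, y, k=7):
    """Toy k-NN: predicted probability is the share of positives among k neighbours."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda p: abs(p[0] - x))[:k]
        return sum(lbl for _, lbl in nearest) / len(nearest)
    return predict


def fit_logistic(X, y, steps=300, lr=0.1):
    """Toy one-feature logistic regression fitted by gradient descent."""
    a = b = 0.0
    n = len(X)
    for _ in range(steps):
        ga = gb = 0.0
        for x, t in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-(a * x + b)))
            ga += (p - t) * x
            gb += p - t
        a -= lr * ga / n
        b -= lr * gb / n
    return lambda x: 1.0 / (1.0 + math.exp(-(a * x + b)))


def fit_prior(X, y):
    """Deliberately weak learner: always predicts the training base rate."""
    rate = sum(y) / len(y)
    return lambda x: rate


def log_loss(y, p, eps=1e-6):
    """Mean negative binomial log-likelihood, with probabilities clipped."""
    total = 0.0
    for t, q in zip(y, p):
        q = min(max(q, eps), 1 - eps)
        total -= t * math.log(q) + (1 - t) * math.log(1 - q)
    return total / len(y)


def out_of_fold(X, y, learners, folds=5):
    """Cross-validated predictions: each row is predicted by models that never saw it."""
    n = len(X)
    Z = [[0.0] * len(learners) for _ in range(n)]
    for f in range(folds):
        train = [i for i in range(n) if i % folds != f]
        test = [i for i in range(n) if i % folds == f]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for j, fit in enumerate(learners):
            model = fit(X_tr, y_tr)
            for i in test:
                Z[i][j] = model(X[i])
    return Z


def fit_weights(Z, y, grid=20):
    """Grid search over non-negative weights summing to 1 that minimise log loss."""
    n_learners = len(Z[0])
    best_w, best_loss = None, float("inf")
    for combo in itertools.product(range(grid + 1), repeat=n_learners - 1):
        if sum(combo) > grid:
            continue
        w = [c / grid for c in combo] + [(grid - sum(combo)) / grid]
        blend = [sum(wj * zj for wj, zj in zip(w, row)) for row in Z]
        loss = log_loss(y, blend)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


X, y = make_data()
learners = [fit_knn, fit_logistic, fit_prior]
Z = out_of_fold(X, y, learners)
weights, ensemble_loss = fit_weights(Z, y)
base_losses = [log_loss(y, [row[j] for row in Z]) for j in range(len(learners))]
print("weights:", weights)
print("base log losses:", base_losses)
print("ensemble log loss:", ensemble_loss)
```

Because the grid includes the corner cases that put all weight on a single learner, the weighted combination can never score worse on the out-of-fold predictions than the best individual base learner. That bound applies only to the cross-validated predictions used to choose the weights, so it does not by itself explain the held-out F1 results reported in the abstract.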
| Item Type: | Thesis (Other) |
|---|---|
| Uncontrolled Keywords: | Komputasi Berkinerja Tinggi, High Performance Computing, MOOC Dropout, Stacking, Super Learner |
| Subjects: | Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines. |
| Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Information System > 57201-(S1) Undergraduate Thesis |
| Depositing User: | Mery Yulinda Rahmi |
| Date Deposited: | 15 Aug 2023 04:42 |
| Last Modified: | 15 Aug 2023 04:42 |
| URI: | http://repository.its.ac.id/id/eprint/102459 |