Deteksi Dini Kelulusan Mahasiswa untuk Mata Kuliah yang Diambil Menggunakan Data Demografi dan Akademik

Limanto, Susana (2024) Deteksi Dini Kelulusan Mahasiswa untuk Mata Kuliah yang Diambil Menggunakan Data Demografi dan Akademik. Doctoral thesis, Institut Teknologi Sepuluh Nopember.

7025211020-Dissertation.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2026.

Abstract

The course pass rate can be raised by providing assistance to the students who need it. A list of such students can be obtained by predicting outcomes from historical academic performance data. Such predictions are most useful when the model performs well and the results are available at the start of the course. However, not all features that significantly influence the prediction model are available before the course begins. In addition, historical academic performance data are imbalanced, which risks degrading the model's performance.
Class imbalance can be addressed by oversampling with synthetic data. However, few oversampling techniques currently handle a combination of qualitative and quantitative features. Existing techniques only expand the decision region of the minority samples when generating synthetic samples, which makes them prone to producing noise and borderline samples.
This research develops two prediction models using student demographic and academic performance data. The dataset consists of 2,346 samples labeled Pass (Lulus) and 184 samples labeled Fail (Gagal). Each sample comprises four qualitative features and eleven quantitative features. The first model is run before the course begins, as an early detector of whether a student will pass the course. The second model is run after the midterm exam (Ujian Tengah Semester, UTS) and adds two features, the UTS score and the number of absences during the first half of the semester, to improve model performance. The second model's predictions can be used to monitor students' academic progress during the course. In addition, an oversampling technique, Global and Local Weighting on SMOTE-Discrete Continuous (GLoW SMOTE-DC), is developed to improve the prediction models' performance. These two developments are the contributions of this dissertation.
The results show that: (1) based on the first three levels of the decision tree (DT) model, lecturer and GPA are the features with a significant influence on the first prediction model across all minority rates and oversampling techniques, while UTS score and lecturer are significant for the second model; (2) oversampling improves performance on the minority samples but decreases performance on the majority samples; (3) in most cases, GLoW SMOTE-DC yields a larger performance gain and a smaller performance decline than ROS, SMOTE-N, and SMOTE-ENC. This is made possible by GLoW SMOTE-DC applying selection and weighting twice, which speeds up the improvement in learning from minority samples and reduces noise. The proposed oversampling technique can therefore be used to support early detection and monitoring of students' passing of the courses they take.
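
To make the imbalance figures above concrete, the following is a minimal sketch, not the dissertation's method. It computes the minority rate from the reported class counts (2,346 Pass, 184 Fail) and applies plain random oversampling (ROS), one of the baselines named in the abstract. The function names `minority_rate` and `random_oversample`, the `target_rate` parameter, and the placeholder feature rows are all assumptions made for illustration; GLoW SMOTE-DC itself is not reproduced here.

```python
import math
import random
from collections import Counter


def minority_rate(labels):
    """Share of the least frequent class among all samples."""
    counts = Counter(labels)
    return min(counts.values()) / len(labels)


def random_oversample(samples, labels, target_rate, seed=0):
    """Duplicate randomly chosen minority samples until the minority
    share is at least target_rate (hypothetical helper, plain ROS)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    n_major = len(labels) - len(minority_idx)
    # Minority count m such that m / (m + n_major) >= target_rate.
    needed = math.ceil(target_rate * n_major / (1 - target_rate))
    extra = max(0, needed - len(minority_idx))
    picks = [rng.choice(minority_idx) for _ in range(extra)]
    new_samples = list(samples) + [samples[i] for i in picks]
    new_labels = list(labels) + [minority] * extra
    return new_samples, new_labels


# Class counts as reported in the abstract; the rows are placeholders only.
labels = ["Pass"] * 2346 + ["Fail"] * 184
samples = list(range(len(labels)))

print(round(minority_rate(labels), 4))  # 184 / 2530
balanced_samples, balanced_labels = random_oversample(samples, labels, 0.5)
print(Counter(balanced_labels))
```

ROS only duplicates existing minority rows, so it adds no new information. The SMOTE-family techniques compared in the abstract instead generate synthetic rows.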

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: class imbalanced, academic data, demographic data, early detection, passing course
Subjects: L Education > L Education (General)
Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines.
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55001-(S3) PhD Thesis (Comp Science)
Depositing User: Susana Limanto
Date Deposited: 02 Aug 2024 06:01
Last Modified: 02 Aug 2024 06:01
URI: http://repository.its.ac.id/id/eprint/112265
