Pengembangan Algoritma Support Vector Machine (SVM) Multiclass untuk Prediktor Kategorik dengan Proportional Class Constraint

Yustanti, Wiyli (2024) Pengembangan Algoritma Support Vector Machine (SVM) Multiclass untuk Prediktor Kategorik dengan Proportional Class Constraint. Doctoral thesis, Institut Teknologi Sepuluh Nopember.

06211960010001-Dissertation.pdf - Accepted Version
Restricted to Repository staff only until 1 April 2026.

Abstract

Three principles govern the grouping of Single Tuition Fee (Uang Kuliah Tunggal, UKT) levels. First, the grouping must reflect the socio-economic capacity of the student's parents. Second, a university must meet its non-tax state revenue (PNBP) target through UKT payments. Third, state universities (PTN), as government institutions, have a social responsibility to admit at least 20% of their students from underprivileged families. These three principles motivate this research: a UKT classification method is needed that accounts for the university's revenue target and the minimum proportion required in the low UKT groups. Solving this problem requires a classification algorithm that can accommodate constraints. The mandatory constraints are the proportion of a given class in the classification results (Proportional Class Constraint) and the minimum total UKT revenue. The predictor variables in the case study are categorical. The literature review found that the classification methods directly applicable to categorical predictors are those based on Kernel Density Classification (KDC). However, KDC makes it difficult to add constraints to the classification model, and many studies show that it can still be outperformed by the Support Vector Machine (SVM). SVM has a limitation of its own: the algorithm's input must be numeric. This research therefore makes two contributions: (1) selecting a categorical predictor encoding method that improves multiclass SVM performance at the pre-processing stage, and (2) developing a multiclass SVM algorithm with a Proportional Class Constraint (SVM-ProClass).
This research uses ordinal encoding and produces the SVM-ProClass algorithm, which has two main stages: a class prediction phase and a class shift phase (Birth-Death Process). SVM-ProClass is then applied to the UKT dataset to produce a dataset that is separable and imbalanced according to the class proportions. The dataset produced by SVM-ProClass achieves an average F1-score of 99.22%, which differs significantly, at α = 5%, from the performance on the dataset without SVM-ProClass.
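The abstract names two ingredients, ordinal encoding of categorical predictors and a class shift phase that enforces a minimum class proportion. The following is only a minimal Python sketch of those two ideas, not the thesis's SVM-ProClass algorithm or its Birth-Death Process. The function names, the rule of moving the samples with the highest score for the target class, and the sample data are all assumptions made for illustration. The revenue constraint is not modeled.

```python
import math
from typing import List, Sequence


def ordinal_encode(values: Sequence[str], order: Sequence[str]) -> List[int]:
    """Map each category to its rank in a known ordering (hypothetical helper)."""
    rank = {level: i for i, level in enumerate(order)}
    return [rank[v] for v in values]


def shift_to_meet_proportion(
    labels: Sequence[int],
    target_scores: Sequence[float],
    target: int,
    min_share: float,
) -> List[int]:
    """Relabel samples into `target` until it holds at least `min_share` of all samples.

    Assumption: samples outside `target` are moved in descending order of
    `target_scores` (for example, a classifier's decision score for `target`).
    This selection rule is illustrative and is not taken from the thesis.
    """
    shifted = list(labels)
    n = len(shifted)
    # Number of extra samples the target class still needs (small epsilon
    # guards against floating-point noise in min_share * n).
    needed = math.ceil(min_share * n - 1e-9) - shifted.count(target)
    if needed <= 0:
        return shifted  # constraint already satisfied
    candidates = sorted(
        (i for i in range(n) if shifted[i] != target),
        key=lambda i: target_scores[i],
        reverse=True,
    )
    for i in candidates[:needed]:
        shifted[i] = target
    return shifted


# Illustrative data only: ten predicted classes and made-up scores for class 1.
encoded = ordinal_encode(["high", "low", "medium"], ["low", "medium", "high"])
predicted = [2, 3, 1, 3, 2, 3, 2, 3, 3, 2]
scores_for_class_1 = [0.1, -0.5, 0.9, -0.2, 0.4, -0.9, 0.3, -0.7, -0.1, 0.2]
adjusted = shift_to_meet_proportion(predicted, scores_for_class_1, target=1, min_share=0.2)
```

Here class 1 stands in for a low UKT group that must contain at least 20% of the samples. Only the sample with the strongest class 1 score is relabeled, since one extra member is enough to reach the required share.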

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: Categorical Predictor, Classification, Support Vector Machine, Multiclass, Proportional Class Constraint
Subjects: Q Science > QA Mathematics > QA278.55 Cluster analysis
Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science)
T Technology > T Technology (General) > T57.5 Data Processing
T Technology > T Technology (General) > T57.62 Simulation
T Technology > T Technology (General) > T57.8 Nonlinear programming. Support vector machine. Wavelets. Hidden Markov models.
T Technology > T Technology (General) > T58.62 Decision support systems
Divisions: Faculty of Mathematics, Computation, and Data Science > Statistics > 49001-(S3) PhD Thesis
Depositing User: Yustanti Wiyli
Date Deposited: 15 Feb 2024 05:39
Last Modified: 15 Feb 2024 05:39
URI: http://repository.its.ac.id/id/eprint/107355
