Sain, Hartayuni (2023) Fuzzy Support Vector Machine untuk Klasifikasi Data Time Series Multikelas Imbalanced [Fuzzy Support Vector Machine for Classification of Imbalanced Multiclass Time Series Data]. Doctoral thesis, Institut Teknologi Sepuluh November.
Text
06211560010001-Dissertation.pdf - Accepted Version. Restricted to repository staff only until 1 April 2026.
Abstract
Support vector machines (SVMs) are among the most extensively developed classification methods, but they focus mainly on cross-sectional analysis. Classification of time series data is nonetheless an important problem in statistics and data mining. Applying an SVM designed for cross-sectional data to time series can produce incorrect classifications, so the SVM needs to be extended to handle time series data. As with cross-sectional data, class imbalance is also common in time series data, and fuzzy methods have been shown to handle imbalanced data. This research develops a Fuzzy Support Vector Machine (FSVM) model for classifying time series data with imbalanced classes. The proposed method places the fuzzy membership function in the constraint function. The objectives of this research are to obtain parameter estimates for the developed FSVM method, to evaluate its performance in classifying time series data through simulation studies with several scenarios of imbalance in the response class proportions, and to apply it to the classification of monthly rainfall data at 3 meteorological stations in Indonesia: the Class II Mutiara Sis Al Jufrie Meteorological Station in Palu, Central Sulawesi; the Class II Ahmad Yani Meteorological Station in Semarang, Central Java; and the Class II Sanglah Geophysical Station in Denpasar, Bali. Classification performance is evaluated by comparing G-Mean, F-Measure, and the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve. Based on classification accuracy and sensitivity, the results show that the developed FSVM method performs better than classification with the SVM method.
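The abstract evaluates classifiers with G-Mean and F-Measure rather than accuracy alone. The following is a minimal sketch of why that matters for imbalanced multiclass data. It assumes G-Mean is the geometric mean of per-class recalls and F-Measure is macro-averaged over classes; the thesis may define them differently. The rainfall-style labels and both prediction vectors are hypothetical and are not taken from the thesis data.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Recall (sensitivity) for every class that occurs in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in support}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed multiclass definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1.0 / len(recalls))


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    recalls = per_class_recall(y_true, y_pred)
    predicted = Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = []
    for c, recall in recalls.items():
        precision = hits[c] / predicted[c] if predicted[c] else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced labels: 8 "normal", 2 "dry", 2 "wet".
y_true = ["normal"] * 8 + ["dry"] * 2 + ["wet"] * 2

# Classifier A always predicts the majority class.
y_majority = ["normal"] * 12

# Classifier B makes some errors but recognises every class.
y_mixed = (["normal"] * 6 + ["dry", "wet"]   # true "normal"
           + ["dry", "dry"]                  # true "dry"
           + ["wet", "normal"])              # true "wet"

for name, y_pred in [("majority", y_majority), ("mixed", y_mixed)]:
    print(name,
          round(accuracy(y_true, y_pred), 3),
          round(g_mean(y_true, y_pred), 3),
          round(macro_f_measure(y_true, y_pred), 3))
```

In this sketch the majority-only classifier still reaches an accuracy of 8/12, but its G-Mean is zero because two classes are never recovered, which is the kind of failure these metrics are meant to expose.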
Item Type: | Thesis (Doctoral) |
---|---|
Uncontrolled Keywords: | Fuzzy Support Vector Machine, time series, multiclass, imbalanced, Gaussian Elastic Metric Kernel, Area Under Curve (AUC) value |
Subjects: | Q Science > Q Science (General) > Q337.5 Pattern recognition systems; Q Science > QA Mathematics > QA276 Mathematical statistics. Time-series analysis. Failure time data analysis. Survival analysis (Biometry) |
Divisions: | Faculty of Science and Data Analytics (SCIENTICS) > Statistics > 49001-(S3) PhD Thesis |
Depositing User: | Hartayuni Sain |
Date Deposited: | 09 Nov 2023 01:56 |
Last Modified: | 09 Nov 2023 01:56 |
URI: | http://repository.its.ac.id/id/eprint/105087 |