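Ginting, Agus Rasi Doanta (2025) Klasifikasi Kantuk Berdasarkan Eye Aspect Ratio (EAR) dan Mouth Aspect Ratio (MAR) Menggunakan Multivariate Long Short-Term Memory-Fully Convolutional Network [Drowsiness Classification Based on Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) Using a Multivariate Long Short-Term Memory-Fully Convolutional Network]. Other thesis, Institut Teknologi Sepuluh Nopember.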
Text: 5024211018_Undergraduate_Thesis.pdf (Accepted Version, 14MB). Restricted to repository staff only.
Abstract
Drowsiness is a significant risk factor in activities that demand sustained concentration, such as driving and operating heavy machinery. This study proposes a drowsiness-level classification approach based on two facial physiological indicators, the Eye Aspect Ratio (EAR) and the Mouth Aspect Ratio (MAR), both extracted from video frames using MediaPipe Face Mesh. The two features are combined into a multivariate time series to capture the temporal dynamics of facial expressions. The classifier adopts the Multivariate Long Short-Term Memory Fully Convolutional Network (MLSTM-FCN) architecture, which is designed specifically for multivariate time series data. Evaluation uses stratified 5-fold cross-validation with three window sizes (1800, 3600, and 5400 frames). The results show that combining EAR and MAR yields higher classification accuracy than using EAR alone, as in previous studies. At the window level, the highest test accuracy, reported as 67.62% in the Indonesian abstract and 66.67% in the English abstract, was achieved with a window size of 3600 frames. At the video level, the model trained with a window size of 1800 frames (1 minute) achieved the highest test accuracy of 86.67%. In addition, the MLSTM-FCN model outperformed the standard LSTM model across all configurations.
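The feature pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The EAR function uses the widely used six-landmark formulation (two vertical distances divided by twice the horizontal distance). Reusing the same form for MAR, cutting the series into non-overlapping windows, and deriving a video-level label by majority vote over window predictions are all assumptions, since the abstract does not specify them. The function names `aspect_ratio`, `make_windows`, and `video_label` are hypothetical, and the landmark coordinates would come from MediaPipe Face Mesh in practice.

```python
import math
from collections import Counter


def aspect_ratio(p1, p2, p3, p4, p5, p6):
    """Six-point aspect ratio: (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).

    p1 and p4 are the horizontal corners; (p2, p6) and (p3, p5) are
    upper/lower point pairs. Using the same form for the mouth (MAR)
    is an assumption made for this sketch.
    """
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def make_windows(ear, mar, window_size):
    """Split per-frame EAR and MAR series into non-overlapping windows.

    Each window is a list of (ear, mar) pairs, i.e. a two-channel
    multivariate series. Trailing frames that do not fill a whole
    window are dropped (an assumption).
    """
    n = min(len(ear), len(mar))
    windows = []
    for start in range(0, n - window_size + 1, window_size):
        end = start + window_size
        windows.append(list(zip(ear[start:end], mar[start:end])))
    return windows


def video_label(window_predictions):
    """Majority vote over window-level predictions (assumed rule)."""
    return Counter(window_predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Synthetic eye landmarks, for illustration only.
    eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
    print(aspect_ratio(*eye))

    # 5400 synthetic frames split into 1800-frame windows.
    ear_series = [0.3] * 5400
    mar_series = [0.1] * 5400
    print(len(make_windows(ear_series, mar_series, 1800)))

    # Hypothetical class names.
    print(video_label(["drowsy", "alert", "drowsy"]))
```

With 1800 frames corresponding to 1 minute, as stated in the abstract, the recordings run at 30 frames per second, so the 3600- and 5400-frame windows cover 2 and 3 minutes respectively.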
Item Type: | Thesis (Other) |
---|---|
Uncontrolled Keywords: | Drowsiness Recognition, Eye Aspect Ratio (EAR), Mouth Aspect Ratio (MAR), Multivariate Long Short-Term Memory Fully Convolutional Network (MLSTM-FCN) |
Subjects: | Q Science R Medicine > R Medicine (General) > R858 Deep Learning |
Divisions: | Faculty of Electrical Technology > Computer Engineering > 90243-(S1) Undergraduate Thesis |
Depositing User: | Agus Rasi Doanta Ginting |
Date Deposited: | 28 Jul 2025 09:02 |
Last Modified: | 28 Jul 2025 09:02 |
URI: | http://repository.its.ac.id/id/eprint/121949 |