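Romadhona, Nathania Cahya (2026) Evaluasi Performa Ekstraksi Fitur Mel dan Gammatone Frequency Cepstral Coefficients Menggunakan Model Deep Learning untuk Klasifikasi Tuberkulosis Berdasarkan Suara Batuk [Performance Evaluation of Mel and Gammatone Frequency Cepstral Coefficients Feature Extraction Using Deep Learning Models for Tuberculosis Classification Based on Cough Sounds]. Other thesis, Institut Teknologi Sepuluh Nopember.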
Text: 5049221016-Undergraduate_Thesis.pdf (Accepted Version, 5MB). Restricted to repository staff only.
Abstract
Tuberculosis (TB) remains a major global health challenge, and conventional diagnostic methods are often limited by cost, time, and accessibility. This study evaluates the performance and optimal configuration of Mel-Frequency Cepstral Coefficients (MFCC) and Gammatone Frequency Cepstral Coefficients (GFCC) features, and compares Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models for binary classification of TB and non-TB coughs, taking into account variations in feature extraction parameters and differences in recording devices. Two datasets were used: secondary CIRDZ data for training and initial testing, and primary data recorded clinically at the RSUA TB DOTS Clinic using an INMP441S microphone. The processing pipeline comprises cough segmentation with Voice Activity Detection, segment validation using the YAMNet model, acoustic quality control, MFCC and GFCC feature extraction with varying numbers of filterbank channels and DCT coefficients, SpecAugment-based augmentation, CNN and LSTM model training, and evaluation using accuracy, precision, sensitivity, F1-score, and Area Under the Curve (AUC). On the secondary data, the 32-channel, 39-coefficient MFCC configuration with the LSTM model and several GFCC combinations reached accuracies of up to 96% with an AUC close to 1. On the primary data, the best accuracy was around 75%, obtained by the CNN model with 40-channel, 13-coefficient MFCC, which highlights the challenge posed by domain differences and recording quality. Overall, the study concludes that choosing an appropriate feature configuration and deep learning architecture is important for improving the performance, and the deployment potential, of a cough-based TB pre-screening system in real clinical settings.
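
The five evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, with 1 standing for a TB cough and 0 for a non-TB cough. AUC is computed here through its rank interpretation (the fraction of positive/negative pairs in which the positive receives the higher score, ties counting half).

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, sensitivity, F1 and AUC for binary labels (1 = TB, 0 = non-TB).

    Hypothetical helper for illustration; the threshold is an assumed default.
    """
    # Turn continuous scores into hard predictions.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    # AUC as the share of (positive, negative) pairs ranked correctly.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision,
            "sensitivity": sensitivity, "f1": f1, "auc": auc}


# Toy data (made up): three TB coughs followed by three non-TB coughs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
metrics = binary_metrics(labels, scores)
print(metrics)
```

On this toy input, one TB cough falls below the threshold and one non-TB cough rises above it, so the threshold-based metrics are all 2/3 or 4/6, while AUC (8 of 9 pairs ranked correctly) is unaffected by the threshold choice. That difference is why a threshold-free measure such as AUC is commonly reported alongside accuracy.
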
| Item Type: | Thesis (Other) |
|---|---|
| Uncontrolled Keywords: | Tuberculosis, Mel-Frequency Cepstral Coefficients, Gammatone Frequency Cepstral Coefficients, Classification, Deep learning |
| Subjects: | R Medicine > R Medicine (General) > R858 Deep Learning |
| Divisions: | Faculty of medicine and health (MEDICS) > Medical Technology > 11503-(S1) Undergraduate Thesis |
| Depositing User: | Nathania Cahya Romadhona |
| Date Deposited: | 03 Feb 2026 02:30 |
| Last Modified: | 03 Feb 2026 02:30 |
| URI: | http://repository.its.ac.id/id/eprint/130706 |