Fatahillah, Sulthanur Iman (2025) Perbandingan Tipe Ekstraksi Fitur Mel-Frequency Cepstral Coeffisients Dalam Mengklasifikasikan Suara Burung Menggunakan Retentive Network Meet Vision Transformer. Other thesis, Institut Teknologi Sepuluh Nopember.
Text: 5002201055_Undergraduate Thesis.pdf - Accepted Version (7MB). Restricted to Repository staff only until 1 April 2027.
Abstract
Bird recognition through visual and auditory observation is important for enhancing ecotourism and enables identification based on each species' unique vocal characteristics. Signal processing and artificial intelligence help improve the accuracy and efficiency of bird sound recognition. Mel-Frequency Cepstral Coefficients (MFCC) are a commonly used feature extraction method in sound recognition; however, the type of MFCC used can affect model performance. One model used for this task is the Retentive Network Meet Vision Transformer (RMT), which is derived from the Retentive Network (RetNet). Using a spatial decay matrix, RMT provides an explicit, bidirectional spatial prior for two-dimensional data. This study compares types of MFCC feature extraction using the RMT model for bird sound classification. Evaluation was conducted on 10% of the dataset using a confusion matrix. Two datasets were used: Omkicau and Xeno-Canto. The results show that standard MFCC achieved the highest accuracy: 90.66% on Omkicau and 97.54% on Xeno-Canto. ∆MFCC extraction obtained 88.33% on Omkicau and 95.27% on Xeno-Canto, while ∆²MFCC reached 85.67% on Omkicau and 94.79% on Xeno-Canto. All three methods demonstrated good generalization capability in recognizing bird species. Standard MFCC delivered the best performance, making it the preferred choice for bird sound classification using the RMT model.
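
To make the distinction between the three feature types concrete, the sketch below shows one common way to obtain ∆ and ∆² features from a sequence of MFCC frames: a regression over neighbouring frames, with the first and last frames repeated at the edges. The abstract does not state how the thesis computed its deltas, so the formula, the function name `delta`, the `width` parameter, and the toy input are all assumptions for illustration only.

```python
def delta(frames, width=2):
    """Regression-based delta of per-frame coefficient vectors.

    Assumed formula (not taken from the thesis):
        d[t] = sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..width} n^2)
    Frames outside the sequence are replaced by the nearest edge frame.
    """
    if not frames:
        return []
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]   # clamp at last frame
                behind = frames[max(t - n, 0)][k]             # clamp at first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


# Hypothetical toy input: 6 frames with 2 coefficients each.
# Coefficient 0 rises by 1 per frame; coefficient 1 is constant.
mfcc = [[float(t), 3.0] for t in range(6)]

d1 = delta(mfcc)   # first-order differences (the "delta MFCC" idea)
d2 = delta(d1)     # delta of delta (the "delta-delta MFCC" idea)
```

In this sketch the standard MFCC frames describe the spectral envelope of each frame, `d1` describes how quickly each coefficient changes over time, and `d2` describes how that rate of change itself varies. All three arrays have the same shape as the input, so any of them can be arranged as a two-dimensional input of the same size.
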
Item Type: | Thesis (Other) |
---|---|
Uncontrolled Keywords: | Birdsong, Feature Extraction, MFCC, Retentive Network, Vision Transformer |
Subjects: | Q Science > QA Mathematics > QA403.3 Wavelets (Mathematics); Q Science > QA Mathematics > QA404 Fourier series; Q Science > QH Biology > QH75 Nature conservation. Landscape protection. Biodiversity conservation. Endangered species and ecosystems (General). Habitat conservation. Ecosystem management. Conservation biology; T Technology > T Technology (General) |
Divisions: | Faculty of Science and Data Analytics (SCIENTICS) > Mathematics > 44201-(S1) Undergraduate Thesis |
Depositing User: | SULTHANUR IMAN FATAHILLAH |
Date Deposited: | 04 Feb 2025 05:50 |
Last Modified: | 04 Feb 2025 05:50 |
URI: | http://repository.its.ac.id/id/eprint/118087 |