Deteksi Suara Deepfake Menggunakan Model Deep Learning (Deepfake Voice Detection Using Deep Learning Models)

Razzaq, Thalent Athalla (2025) Deteksi Suara Deepfake Menggunakan Model Deep Learning. Other thesis, Institut Teknologi Sepuluh Nopember.

There is a more recent version of this item available.

Text: 5025211101-Undergraduate _Thesis.pdf (9MB)
Restricted to Repository staff only

Abstract

This study aims to develop an effective deep learning model for detecting deepfake voices, in response to the security and trust challenges posed by this technology. The proposed method uses a framework based on Recurrent Neural Networks (RNNs), covering Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures and their bidirectional variants. The models are trained on extracted audio features, such as Mel-frequency Cepstral Coefficients (MFCC), chromagrams, and other spectral features, to capture the temporal characteristics of the speech signal. Testing was conducted on three public datasets: DEEP-VOICE: DeepFake Voice Recognition, The Fake-or-Real (FoR) Dataset, and Hemg/Deepfakeaudio. The results show two main findings. First, the model achieves very high detection performance, with accuracy exceeding 99%, in intra-dataset testing scenarios. Second, performance drops drastically when the model is tested on cross-domain data, owing to domain shift. After systematic hyperparameter tuning in the combined-dataset scenario, the best accuracy achieved was 75.14%. These results confirm that the primary challenge lies not in model optimization but in the model's ability to generalize across domains. The benefit of this research is a system that can be implemented to improve security in sectors such as banking, identity verification, and privacy protection by detecting voice manipulation.
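
The distinction between the intra-dataset and cross-domain scenarios described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the helper names `evaluation_scenarios` and `accuracy` are hypothetical, and pairing every dataset with every other one is an assumption, since the abstract does not state which train/test combinations were actually run.

```python
from itertools import product

# Dataset names as given in the abstract.
DATASETS = ["DEEP-VOICE", "FoR", "Hemg/Deepfakeaudio"]


def evaluation_scenarios(names):
    """Yield (train_name, test_name, kind) for every ordered dataset pair.

    Assumption: all ordered pairs are enumerated. "intra-dataset" means the
    training and test splits come from the same dataset (the splits
    themselves must still be disjoint); "cross-domain" means the model is
    tested on a dataset it was not trained on.
    """
    for train_name, test_name in product(names, repeat=2):
        kind = "intra-dataset" if train_name == test_name else "cross-domain"
        yield train_name, test_name, kind


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    for train_name, test_name, kind in evaluation_scenarios(DATASETS):
        print(f"train on {train_name:20s} test on {test_name:20s} [{kind}]")
```

Reporting accuracy separately for each scenario kind is what makes a generalization gap visible: a model can score highly on held-out data from its own training dataset and still perform poorly on recordings from another source.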

Item Type:	Thesis (Other)
Uncontrolled Keywords:	Audio Signal Processing, Chromagram, Deepfake Detection, Deep Learning, MFCC, Recurrent Neural Network, Speech Recognition
Subjects:	M Music and Books on Music > M Music; Q Science > QA Mathematics > QA336 Artificial Intelligence; T Technology > T Technology (General) > T57.5 Data Processing
Divisions:	Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User:	Thalent Athalla Razzaq
Date Deposited:	30 Jul 2025 09:38
Last Modified:	30 Jul 2025 09:38
URI:	http://repository.its.ac.id/id/eprint/122203

Available Versions of this Item

- Deteksi Suara Deepfake Menggunakan Model Deep Learning. (deposited 30 Jul 2025 09:38) [Currently Displayed]
- Deteksi Suara Deepfake Menggunakan Model Deep Learning. (deposited 08 Aug 2025 03:24)