IMPLEMENTATION OF A WEB-BASED HATE SPEECH DETECTION IN INDONESIAN LANGUAGE USING CNN AND LSTM COMBINED METHOD

Kusumajaya, Muhammad Naufal (2024) IMPLEMENTATION OF A WEB-BASED HATE SPEECH DETECTION IN INDONESIAN LANGUAGE USING CNN AND LSTM COMBINED METHOD. Other thesis, Institut Teknologi Sepuluh Nopember.

5026201121-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2026.

Abstract

The rapid development of social media enables direct communication between users with diverse backgrounds and psychological characteristics. These differences can increase the risk of conflict and the emergence of hate speech. This final project builds a web-based hate speech detector for the Indonesian language using a combination of Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) methods. The data come from the social media platform X and cover the period from 2017 to 2019. We use 70% of the data for model building and the remaining 30% for model testing. To find the best CNN-LSTM architecture, we experiment with several combinations of hyperparameter values using 10-fold cross-validation. The experimental results show that the best model achieves accuracy, precision, recall, and F1-score of 87.56%, 86.21%, 90.51%, and 88.29%, respectively. The combined CNN-LSTM model performs relatively better than the best model that uses only CNN (accuracy, precision, recall, and F1-score of 85.12%, 84.14%, 86.32%, and 88.29%, respectively) or only LSTM (accuracy, precision, recall, and F1-score of 84.23%, 83.11%, 85.21%, and 84.31%, respectively). Furthermore, the recall value is greater than the precision value, which indicates that the model is better at minimizing false negatives than false positives. This result is in line with the purpose of a hate speech detection tool, which is to minimize incorrect predictions on sentences that clearly contain hate speech. The deployed hate speech detector can be accessed from a smartphone or desktop at https://id-hs-detector.streamlit.app/.
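
The evaluation protocol described in the abstract (a 70/30 split, 10-fold cross-validation on the model-building portion, and the four reported metrics) can be illustrated with a minimal, dependency-free Python sketch. This is not the thesis code: the function names, the random seed, the round-robin fold assignment, and the confusion-matrix counts are all assumptions made for illustration only.

```python
import random


def train_test_split_indices(n, test_fraction=0.30, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation.

    Folds are assigned round-robin; every index is used for validation
    exactly once across the k iterations.
    """
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of flagged sentences that are hate speech
    recall = tp / (tp + fn)     # share of hate speech sentences that are flagged
    return accuracy, precision, recall, f1_from(precision, recall)


if __name__ == "__main__":
    # 1000 is a placeholder dataset size, not the size used in the thesis.
    train, test = train_test_split_indices(1000)
    print(len(train), len(test))

    for fold_train, fold_val in k_fold_indices(train, k=10):
        pass  # a model would be fitted on fold_train and scored on fold_val

    # Hypothetical counts chosen so that false negatives < false positives.
    print(binary_metrics(tp=90, fp=14, fn=10, tn=86))

    # Harmonic mean of the precision and recall reported for the best model.
    print(f1_from(86.21, 90.51))
```

With the precision and recall reported for the best CNN-LSTM model (86.21% and 90.51%), the harmonic mean comes out at roughly 88.3%. In the hypothetical confusion matrix, recall exceeds precision precisely because there are fewer false negatives than false positives, which is the trade-off the abstract describes as desirable for a hate speech detector.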

Item Type: Thesis (Other)
Uncontrolled Keywords: automatic detection, artificial neural network, artificial intelligence, social media, hate speech
Subjects: T Technology > T Technology (General) > T58.62 Decision support systems
Divisions: Faculty of Information and Communication Technology > Information Systems > 57201-(S1) Undergraduate Thesis
Depositing User: Muhammad Naufal Kusumajaya
Date Deposited: 02 Aug 2024 06:23
Last Modified: 02 Aug 2024 06:23
URI: http://repository.its.ac.id/id/eprint/111420
