Performance Comparison of SVM, MNB, RF, and K-NN Models in YouTube Sentiment Analysis of the Childfree Phenomenon in Indonesia

Luthfiyah, Sausan Firdha (2025) Perbandingan Performa Model SVM, MNB, RF, dan K-NN Dalam Analisis Sentimen YouTube Tentang Fenomena Childfree di Indonesia. Other thesis, Institut Teknologi Sepuluh Nopember.

5002211065_Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only

Abstract

The childfree phenomenon in Indonesia has prompted a wide range of public opinion, which can be analyzed through comments on social media, particularly YouTube. This study compares the performance of four classification models, Support Vector Machine (SVM), Multinomial Naïve Bayes (MNB), Random Forest (RF), and K-Nearest Neighbors (K-NN), for sentiment analysis of comments about childfree. A total of 27,359 comments were collected from 35 relevant YouTube videos and processed through text preprocessing, feature extraction with TF-IDF and BM25 weighting, and class balancing with SMOTE. The models were evaluated on accuracy, precision, recall, F1-score, and ROC-AUC. SVM with TF-IDF weighting and a 90:10 data split performed best, with 89% accuracy and a 76% F1-score. RF, MNB, and K-NN performed worse, with K-NN the weakest. The results indicate that SVM classifies sentiment in a balanced and consistent manner and can serve as a reference for developing text-based methods for analyzing public opinion on social media.
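
The abstract names TF-IDF and BM25 as the two feature-weighting schemes being compared. As a rough illustration of how they differ, the sketch below computes both weights for one term in a toy corpus. This record does not describe the thesis's actual implementation, so everything here is an assumption: the example comments are hypothetical, the whitespace tokenizer stands in for the real preprocessing, the IDF formulas are common textbook variants, and `k1=1.5` and `b=0.75` are conventional BM25 defaults rather than values taken from the thesis.

```python
import math


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def doc_freq(term, docs):
    """Number of documents that contain the term at least once."""
    return sum(1 for doc in docs if term in doc)


def tfidf_weight(term, doc, docs):
    """Raw term frequency times a smoothed IDF: ln((1 + N) / (1 + df)) + 1."""
    idf = math.log((1 + len(docs)) / (1 + doc_freq(term, docs))) + 1.0
    return doc.count(term) * idf


def bm25_weight(term, doc, docs, k1=1.5, b=0.75):
    """BM25 weight: saturating term frequency with document-length normalization."""
    tf = doc.count(term)
    df = doc_freq(term, docs)
    # One common non-negative IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5)).
    idf = math.log(1.0 + (len(docs) - df + 0.5) / (df + 0.5))
    avg_len = sum(len(d) for d in docs) / len(docs)
    length_norm = 1.0 - b + b * len(doc) / avg_len
    return idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)


# Hypothetical toy comments, not data from the thesis.
comments = [
    "childfree is a personal choice",
    "choice choice choice it is their choice",
    "children are a blessing",
]
docs = [tokenize(c) for c in comments]
short_doc, repeat_doc = docs[0], docs[1]

# How much more weight does "choice" get when it occurs 4 times instead of once?
tfidf_ratio = tfidf_weight("choice", repeat_doc, docs) / tfidf_weight("choice", short_doc, docs)
bm25_ratio = bm25_weight("choice", repeat_doc, docs) / bm25_weight("choice", short_doc, docs)

print(f"TF-IDF weight ratio (4 occurrences vs 1): {tfidf_ratio:.2f}")
print(f"BM25 weight ratio (4 occurrences vs 1):   {bm25_ratio:.2f}")
```

With raw term frequency, the TF-IDF weight grows in direct proportion to the number of repetitions. The BM25 weight grows more slowly because its term-frequency component saturates and longer comments are penalized. This is one plausible reason the two schemes can lead to different classifier results on short, repetitive comment text.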

Item Type: Thesis (Other)
Uncontrolled Keywords: Sentiment Analysis, BM25, Machine Learning, SMOTE, TF-IDF.
Subjects: Q Science > QA Mathematics > QA76.6 Computer programming.
Divisions: Faculty of Mathematics, Computation, and Data Science > Mathematics > 44201-(S1) Undergraduate Thesis
Depositing User: Sausan Firdha Luthfiyah
Date Deposited: 28 Jul 2025 01:32
Last Modified: 28 Jul 2025 01:32
URI: http://repository.its.ac.id/id/eprint/121607
