Akmal, Fadhl and Ma'ruf, Muhammad Rifqi (2025) Studi Komparatif Skenario Fine-Tuning Model Transformer untuk Analisis Sentimen Teks Twitter Berbahasa Indonesia. Project Report. [s.n.], [s.l.]. (Unpublished)
Text: 5025221028_5025221060-Project_Report.pdf - Accepted Version (1MB)
Abstract
The rapid growth of Indonesian digital content calls for accurate and efficient sentiment analysis systems. Transformer models based on transfer learning have shown great potential, yet the most effective fine-tuning strategy for the Indonesian context still requires exploration. This research conducts a comparative study of three fine-tuning scenarios (Standard Fine-Tuning, Gradual Unfreezing, and Differential Learning Rates) on three model architectures: IndoBERTbase, IndoBERTweet, and RoBERTa. The models were tested on two datasets from different domains, application reviews (BBM Dataset) and political comments (Pemilu Dataset), with F1-Score as the primary evaluation metric. The results show that IndoBERTweet consistently achieved the highest performance across all scenarios, reaching a peak F1-Score of 0.9218 on the BBM Dataset and 0.7431 on the Pemilu Dataset. Another key finding is that Standard Fine-Tuning with an optimized learning rate proved superior to the two more complex techniques. The study concludes that domain alignment between a model's pre-training data and the target task data is a crucial factor for high performance, and that a simpler but well-optimized method can be more effective than complex strategies.
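To illustrate how the three fine-tuning scenarios differ in practice, the sketch below sets up each strategy with PyTorch and HuggingFace Transformers. This is not the authors' code: the checkpoint id, label count, learning rates, decay factor, and helper names are illustrative assumptions.

```python
# Minimal sketch of the three fine-tuning scenarios compared in the report,
# assuming a BERT-base-sized encoder (12 layers) and the HuggingFace stack.
import torch
from transformers import AutoModelForSequenceClassification

MODEL_NAME = "indolem/indobertweet-base-uncased"  # assumed checkpoint id
NUM_LABELS = 3  # assumed label set: positive / neutral / negative

def load_model():
    return AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=NUM_LABELS
    )

# Scenario 1: Standard Fine-Tuning -- one learning rate for all parameters.
def standard_optimizer(model, lr=2e-5):
    return torch.optim.AdamW(model.parameters(), lr=lr)

# Scenario 2: Gradual Unfreezing -- start with only the classification head
# trainable, then unfreeze encoder layers from the top as training progresses.
def freeze_all_but_classifier(model):
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith("classifier")

def unfreeze_top_layers(model, n_layers):
    for name, param in model.named_parameters():
        for i in range(12 - n_layers, 12):  # unfreeze the top n_layers
            if f"encoder.layer.{i}." in name:
                param.requires_grad = True

# Scenario 3: Differential Learning Rates -- smaller rates for lower layers,
# larger rates for layers closer to the classification head.
def differential_optimizer(model, base_lr=2e-5, decay=0.9):
    groups = []
    for i in range(12):
        layer_params = [p for n, p in model.named_parameters()
                        if f"encoder.layer.{i}." in n]
        groups.append({"params": layer_params, "lr": base_lr * decay ** (11 - i)})
    head_params = [p for n, p in model.named_parameters()
                   if "encoder.layer." not in n]
    groups.append({"params": head_params, "lr": base_lr})
    return torch.optim.AdamW(groups)
```

Consistent with the report's finding, the single-rate setup of Scenario 1, given a well-tuned learning rate, outperformed the two layered strategies on both datasets.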
| Item Type: | Monograph (Project Report) |
| --- | --- |
| Uncontrolled Keywords: | Sentiment Analysis, Transfer Learning, Fine-Tuning, BERT, IndoBERT, IndoBERTweet, RoBERTa, Natural Language Processing, Indonesian Language |
| Subjects: | T Technology > T Technology (General) > T57.5 Data Processing |
| Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis |
| Depositing User: | Muhammad Rifqi Ma'ruf |
| Date Deposited: | 23 Jul 2025 01:30 |
| Last Modified: | 23 Jul 2025 01:30 |
| URI: | http://repository.its.ac.id/id/eprint/120632 |