Detecting Fake News In Indonesian News Articles Using Support Vector Machine And Random Forest Machine Learning

Ibrahim, Hilmy Hanif (2024) Detecting Fake News In Indonesian News Articles Using Support Vector Machine And Random Forest Machine Learning. Other thesis, Institut Teknologi Sepuluh Nopember.

05111942000005-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only

Abstract

The proliferation of fake news, defined as the intentional dissemination of false or misleading information, has become a global concern in recent years (Rusli et al., 2020; Nayoga et al., 2021; Fawaid et al., 2021; Agudelo et al., 2018). In Indonesia, one of the world's most populous nations, the problem has reached alarming levels: surveys indicate that 46% to 61% of rural Indonesians and 45.3% to 79.6% of urban Indonesians have been exposed to and believed fake news (Fawaid et al., 2021). The widespread circulation of misinformation poses serious risks, including the spread of false narratives, erosion of public trust in media and institutions, and potential social and political instability. To address this problem, this study explores the application of machine learning models to fake news detection in the Indonesian context. The primary objective is to evaluate how effectively two widely used models, Support Vector Machine (SVM) and Random Forest, identify fake news in Indonesian datasets. The proposed solution aims to use these algorithms to handle linguistic and contextual challenges specific to the Indonesian language, such as regional dialects, slang, and code-switching. The dataset was obtained from the Indonesian Hoax News Detection Dataset (Pratiwi et al., 2017), a publicly available labeled collection of fake and real news articles in Bahasa Indonesia. The method consists of preprocessing the text, extracting features such as TF-IDF, and training the SVM and Random Forest models on the dataset. The models were then evaluated on their accuracy and their ability to generalize to unseen data. Both models achieved moderate success, with accuracy scores of 56.15% for SVM and 55.96% for Random Forest. While these outcomes highlight the potential of machine learning in tackling fake news, they also underscore the need for further optimization, including the integration of contextual language models such as IndoBERT and the use of larger and more diverse datasets. These findings contribute to the growing body of research aimed at combating misinformation in Indonesia and provide a foundation for future work in this area.
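
The pipeline described in the abstract (TF-IDF features, an SVM and a Random Forest classifier, accuracy on held-out data) could be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the thesis's actual code: the toy sentences and labels are invented placeholders rather than entries from the Pratiwi et al. dataset, and the linear SVM variant, the 75/25 stratified split, the number of trees, and the `evaluate` helper are all hypothetical choices that the abstract does not specify.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented placeholder corpus (NOT from the real dataset); 1 = hoax, 0 = valid.
texts = [
    "minum air garam menyembuhkan semua penyakit dalam sehari",
    "vaksin mengandung chip pelacak rahasia",
    "bumi akan gelap total selama tiga hari minggu depan",
    "uang kertas baru bisa meledak jika terkena air",
    "pemerintah mengumumkan jadwal pemilu tahun depan",
    "bank sentral mempertahankan suku bunga acuan",
    "tim nasional menang dua gol dalam laga persahabatan",
    "kementerian membuka pendaftaran beasiswa baru",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def evaluate(texts, labels, seed=0):
    """Train both models on TF-IDF features and return held-out accuracy."""
    # Assumed split: 75% train, 25% test, stratified by label.
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    # Each pipeline fits its own TF-IDF vocabulary on the training split only,
    # so no information from the test split leaks into the features.
    models = {
        "svm": make_pipeline(TfidfVectorizer(), LinearSVC(random_state=seed)),
        "random_forest": make_pipeline(
            TfidfVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=seed),
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(x_test))
    return scores


scores = evaluate(texts, labels)
for name, acc in scores.items():
    print(f"{name}: {acc:.2%}")
```

With only eight invented sentences the printed accuracies are meaningless; the sketch only shows how the two models can share the same feature extraction and evaluation procedure so that their scores are comparable.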

Item Type:	Thesis (Other)
Uncontrolled Keywords:	Fake news, Indonesia, Machine learning, Support Vector Machine (SVM), Random Forest
Subjects:	T Technology > T Technology (General) > T57.5 Data Processing
Divisions:	Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User:	Hilmy Hanif Ibrahim
Date Deposited:	06 Aug 2025 02:13
Last Modified:	06 Aug 2025 02:13
URI:	http://repository.its.ac.id/id/eprint/125154
