Retrieval-Augmented Generation (RAG) Dengan Large Language Model Bahasa Indonesia Untuk Question Answering

Tampubolon, Andrew Lomaksan Manuel (2025) Retrieval-Augmented Generation (RAG) Dengan Large Language Model Bahasa Indonesia Untuk Question Answering. Masters thesis, Institut Teknologi Sepuluh Nopember.

6025231025-Master_Thesis.pdf - Accepted Version (4MB)
Restricted to Repository staff only

Abstract

Large Language Models (LLMs) are artificial intelligence technologies with promising prospects for research. LLMs are useful for processing text-based information, including tasks such as Question Answering (QA). LLMs are currently also being developed for low-resource languages such as Bahasa Indonesia. QA can be divided into Open-Domain Question Answering (ODQA) and Domain-Specific Question Answering (DSQA). However, several challenges arise in their implementation, such as fine-tuning and hallucination. The fine-tuning process requires substantial resources and computational power, while hallucination means that LLM outputs may be non-factual or irrelevant. This study proposes the Retrieval-Augmented Generation (RAG) approach to address these challenges in LLM implementation. It also integrates RAG with the Synergizing Reasoning and Acting (ReAct) method, which assists the model in reasoning tasks. The study employs the low-resource language Bahasa Indonesia in both the dataset and the LLMs used. To optimize the retriever component, hyperparameter tuning is performed: the grid search method is used to determine the optimal chunk size and chunk overlap in the context windowing process. Dense Passage Retriever, All-MPNet-Base-v2, and BM25 are used as retriever components. On the generative side, LLMs fine-tuned on Bahasa Indonesia are compared with their respective base models. The results indicate that the configuration of the Gemma-2-9B LLM, the BM25 retriever, and ReAct yields the best performance on both the simple and complex ODQA datasets, as well as the COVID-19 DSQA dataset. In many scenarios, the BM25 retriever provides the most relevant context according to the evaluation metrics. ReAct enhances the reasoning of models in the Gemma-2-9B family when generating answers. Experiments also demonstrate that base models outperform Indonesian fine-tuned LLMs such as SahabatAI-Gemma-2-9B and SahabatAI-Llama-3-8B in answering questions.
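The retriever tuning described above — a grid search over chunk size and chunk overlap, scored with BM25 — can be sketched as follows. This is a minimal illustration, not the thesis code: the chunking is word-based, the helper names (chunk_text, bm25_scores, grid_search_chunking) are hypothetical, and the selection criterion (the BM25 score of the best-matching chunk) stands in for the retrieval evaluation metrics used in the study.

```python
import math
from collections import Counter


def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into word-based chunks with a sliding (overlapping) window."""
    words = text.split()
    step = max(1, chunk_size - chunk_overlap)
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break
    return chunks


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc against the query with the Okapi BM25 ranking function."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)
    n = len(tokenized)
    q_terms = query.lower().split()
    # document frequency of each query term across the chunk collection
    df = {t: sum(1 for d in tokenized if t in d) for t in q_terms}
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for t in q_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(score)
    return scores


def grid_search_chunking(text, query, sizes, overlaps):
    """Grid-search chunk_size/chunk_overlap; keep the pair whose best chunk
    achieves the highest BM25 score for the query."""
    best = None
    for size in sizes:
        for overlap in overlaps:
            if overlap >= size:  # skip invalid configurations
                continue
            top = max(bm25_scores(query, chunk_text(text, size, overlap)))
            if best is None or top > best[0]:
                best = (top, size, overlap)
    return best[1], best[2]
```

In a full RAG pipeline the best chunks retrieved this way would be inserted into the LLM prompt as context; here BM25 is chosen because, per the results above, it produced the most relevant context in many scenarios.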

Item Type: Thesis (Masters)
Uncontrolled Keywords: Bahasa Indonesia, Large Language Model, Prompt Engineering, Question Answering, Synergizing Reasoning and Acting, Retrieval-Augmented Generation
Subjects: A General Works > AI Indexes (General)
T Technology > T Technology (General) > T58.6 Management information systems
T Technology > T Technology (General) > T58.64 Information resources management
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55101-(S2) Master Thesis
Depositing User: Andrew Lomaksan Manuel Tampubolon
Date Deposited: 05 Aug 2025 04:33
Last Modified: 05 Aug 2025 04:33
URI: http://repository.its.ac.id/id/eprint/125192
