Implementasi Metode Generative Pre-Trained Transformer Pada Dataset Teks Bilingual Inggris-Indonesia

Dinasty, Dinasty (2025) Implementasi Metode Generative Pre-Trained Transformer Pada Dataset Teks Bilingual Inggris-Indonesia. Other thesis, Institut Teknologi Sepuluh Nopember.

Text: 06111940000026-Undergraduate_Thesis.pdf - Accepted Version (3MB)
Restricted to Repository staff only

Abstract

Neural Machine Translation (NMT) is a crucial field in natural language processing (NLP) that poses unique challenges for resource-limited languages such as Indonesian. This study develops and evaluates a Generative Pre-trained Transformer (GPT) model for Indonesian-to-English translation, addressing the research gap in this under-explored language pair. The GPT model (based on the GPT-2 architecture), pre-trained on multilingual corpora, was adapted through intensive fine-tuning on a domain-specific public bilingual dataset drawn from diverse sources, including government documents, academic articles, and online media. A comprehensive evaluation compared the model's output against professional human translations using the METEOR (Metric for Evaluation of Translation with Explicit ORdering) and BLEU (Bilingual Evaluation Understudy) metrics. The results show that while translation quality does not yet fully match human translation, the approach delivers promising performance and remains relevant to a range of real-world translation applications.
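For illustration, the fine-tuning step described in the abstract can be sketched in code. The following is a minimal, hypothetical example (not the thesis's actual implementation) of adapting a pre-trained GPT-2 model to Indonesian-to-English translation by casting each sentence pair as a single causal language-modelling string, assuming the Hugging Face transformers library; the base checkpoint ("gpt2"), prompt format, toy data, and hyperparameters are all assumptions for demonstration only.

    import torch
    from torch.utils.data import Dataset
    from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                              TrainingArguments, DataCollatorForLanguageModeling)

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Toy stand-in for the bilingual corpus described in the abstract.
    pairs = [
        ("Saya suka membaca buku.", "I like reading books."),
        ("Dokumen ini bersifat publik.", "This document is public."),
    ]

    class BilingualDataset(Dataset):
        """Wraps (Indonesian, English) pairs as single LM training strings."""
        def __init__(self, pairs, tokenizer, max_len=128):
            texts = [f"Indonesian: {src}\nEnglish: {tgt}{tokenizer.eos_token}"
                     for src, tgt in pairs]
            self.enc = tokenizer(texts, truncation=True, max_length=max_len)

        def __len__(self):
            return len(self.enc["input_ids"])

        def __getitem__(self, i):
            return {k: torch.tensor(v[i]) for k, v in self.enc.items()}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="gpt2-id-en", num_train_epochs=3,
                               per_device_train_batch_size=2),
        train_dataset=BilingualDataset(pairs, tokenizer),
        # mlm=False selects the standard causal-LM objective used by GPT-2.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

At inference time, the same prompt would be supplied only up to "English:", with the model's continuation taken as the candidate translation.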
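The METEOR and BLEU comparison against human references can likewise be sketched with NLTK. This is an illustrative example only; the sentence pair, smoothing choice, and library are assumptions, not the evaluation pipeline actually used in the thesis.

    import nltk
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
    from nltk.translate.meteor_score import meteor_score

    nltk.download("wordnet")  # METEOR uses WordNet for synonym matching

    # Recent NLTK versions expect pre-tokenized input for both metrics.
    reference = "I like reading books.".split()
    hypothesis = "I like to read books.".split()

    # BLEU: modified n-gram precision; smoothing avoids zero scores when a
    # short sentence has no higher-order n-gram matches.
    bleu = sentence_bleu([reference], hypothesis,
                         smoothing_function=SmoothingFunction().method1)

    # METEOR: unigram matching with stemming/synonymy plus an explicit
    # word-order (fragmentation) penalty.
    meteor = meteor_score([reference], hypothesis)

    print(f"BLEU:   {bleu:.3f}")
    print(f"METEOR: {meteor:.3f}")

Both scores range from 0 to 1; in the study they quantify how closely the model's output approaches the professional human reference.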

Item Type: Thesis (Other)
Uncontrolled Keywords: GPT (Generative Pre-trained Transformer), Machine Translation, Indonesian Language, Natural Language Processing (NLP), METEOR (Metric for Evaluation of Translation with Explicit ORdering), Neural Machine Translation (NMT)
Subjects: Q Science > QA Mathematics > QA76.87 Neural networks (Computer Science)
Divisions: Faculty of Science and Data Analytics (SCIENTICS) > Mathematics > 44201-(S1) Undergraduate Thesis
Depositing User: Dinasty Dinasty
Date Deposited: 05 Aug 2025 01:49
Last Modified: 05 Aug 2025 01:49
URI: http://repository.its.ac.id/id/eprint/127256
