Hasiholan, Gilbert Immanuel (2025) Optimalisasi Kualitas Konten Dan Pemberian Umpan Balik Kontekstual Dalam Esai Bahasa Indonesia Menggunakan Optical Character Recognition (OCR) Dan Large Language Models (LLM). Other thesis, Institut Teknologi Sepuluh Nopember.
Text
5027211056-Undergraduate_Thesis.pdf - Accepted Version (3MB). Restricted to Repository staff only.
Abstract
This research develops an automated essay assessment system to address the inefficiency of manual evaluation, which is time-consuming and prone to subjectivity. The system integrates TrOCR-based Optical Character Recognition (OCR) with a Large Language Model (LLM) to analyze handwritten essays in Bahasa Indonesia. The TrOCR model was fine-tuned on the GoodNotes Handwriting Kollection (GNHK) dataset combined with synthetic Indonesian data in several configurations (3k, 6k, 10k, and 30k samples). The system implements an end-to-end pipeline covering text region detection, character recognition, multi-level text correction using SymSpell and an LLM, and essay assessment against grammar, argument structure, and coherence criteria. A comparative evaluation against EasyOCR, TesseractOCR, and a ViT-IndoBERT model shows that the trocr_30k model achieves the best Character Error Rate (CER), 0.124, and a Character F1-score of 0.909, while EasyOCR leads in Word F1-score (0.956) and inference time (13.73 seconds). The results show that superior character accuracy does not always translate into superior word accuracy, so the optimal model choice depends on how sensitive the downstream LLM is to different types of OCR errors. The system provides contextual feedback and objective assessment, supporting more efficient essay evaluation in educational settings.
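The gap between character-level and word-level accuracy noted in the abstract can be illustrated with a minimal sketch. It assumes the conventional definition of CER (Levenshtein edit distance divided by reference length); the abstract does not state the exact formula the thesis uses. The word-level error rate below is a stand-in for the Word F1-score reported in the thesis, not the same metric, and the example sentence and the function names `edit_distance`, `cer`, and `wer` are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits per reference word (whitespace tokens)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Hypothetical OCR output: one wrong character in three of four words.
    reference = "saya pergi ke sekolah"
    hypothesis = "saja pergl ke sekolab"
    print(f"CER = {cer(reference, hypothesis):.3f}")
    print(f"WER = {wer(reference, hypothesis):.3f}")
```

In this made-up case three single-character substitutions give a CER of 3/21 but a word error rate of 3/4, because each error lands in a different word. A model whose errors cluster inside fewer words can therefore score worse at the character level yet better at the word level.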
Item Type: | Thesis (Other) |
---|---|
Uncontrolled Keywords: | Large Language Models, Optical Character Recognition, TrOCR, Essay, Bahasa Indonesia, Fine-tuning |
Subjects: | T Technology > T Technology (General) > T57.84 Heuristic algorithms; T Technology > T Technology (General) > T58.6 Management information systems |
Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Information Technology > 59201-(S1) Undergraduate Thesis |
Depositing User: | Gilbert Immanuel Hasiholan |
Date Deposited: | 21 Jul 2025 02:46 |
Last Modified: | 21 Jul 2025 02:46 |
URI: | http://repository.its.ac.id/id/eprint/120192 |