Lexical Simplification Using Multilingual Controllable Inverted Transformer-based Lexical Simplification

Kendenan, Orlantha (2024) Lexical Simplification Using Multilingual Controllable Inverted Transformer-based Lexical Simplification. Other thesis, Institut Teknologi Sepuluh Nopember.

5002201097-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2026.


Abstract

With the advent of Large Language Models (LLMs), lexical simplification has become a popular research topic, in large part because of its difficulty. LLMs still frequently hallucinate by treating common words as difficult words to be replaced with similar words or expanded into their definitions. Related work has therefore tried to identify the key factors in lexical simplification, one of which is the identification of difficult words. This undergraduate thesis proposes an LLM-based model intended to identify difficult words more accurately during lexical simplification. It applies the concept of inverse attention to Multilingual Controllable Transformer-based Lexical Simplification, abbreviated as imTLS. The proposed model is evaluated on a public lexical-simplification dataset. Experimental results show that imTLS performs lexical simplification better than the comparison models, as demonstrated by the ACC@1, ACC@K@Top1, Potential@K, and MAP@K metrics.
======================================
The rise of Large Language Models (LLMs) has made lexical simplification a widely studied topic, largely because of its difficulty. One typical failure mode is hallucination: models often misidentify common words as difficult terms. Accordingly, related studies have attempted to identify a few important keys for simplification, a major one being the identification of difficult or complex words. This study proposes an LLM-based model that can identify complex words more precisely during lexical simplification. The proposed model applies the inverse-attention concept to Multilingual Controllable Transformer-based Lexical Simplification, and is hence called imTLS. In this study, the proposed model is tested on a public dataset for lexical simplification. The experimental results show that imTLS performs lexical simplification better than the other comparison models, as demonstrated by the ACC@1, ACC@K@Top1, Potential@K, and MAP@K metrics.
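The four evaluation metrics named in the abstract are standard in the lexical-simplification literature (e.g. the TSAR-2022 shared task). As a rough sketch of how they are typically computed — the thesis itself may use slightly different variants — given a ranked candidate list per instance, a set of gold substitutes, and the single most frequent gold substitute:

```python
# Common lexical-simplification metrics, sketched under TSAR-style
# definitions (an assumption; the thesis may define them differently).

def acc_at_1(ranked, gold_sets):
    """Fraction of instances whose top-ranked candidate is a gold substitute."""
    return sum(r[0] in g for r, g in zip(ranked, gold_sets)) / len(ranked)

def acc_at_k_top1(ranked, top1_golds, k):
    """Fraction where the most frequent gold substitute is among the top k."""
    return sum(t in r[:k] for r, t in zip(ranked, top1_golds)) / len(ranked)

def potential_at_k(ranked, gold_sets, k):
    """Fraction where at least one top-k candidate is a gold substitute."""
    return sum(any(c in g for c in r[:k])
               for r, g in zip(ranked, gold_sets)) / len(ranked)

def map_at_k(ranked, gold_sets, k):
    """Mean average precision over the top-k candidates."""
    total = 0.0
    for r, g in zip(ranked, gold_sets):
        rel, ap = 0, 0.0
        for i, c in enumerate(r[:k], start=1):
            if c in g:
                rel += 1
                ap += rel / i          # precision at this relevant rank
        total += ap / min(k, len(g))   # normalize by achievable relevant hits
    return total / len(ranked)
```

For example, with `ranked = [["simple", "easy", "plain"], ["big", "large", "huge"]]` and `gold_sets = [{"easy", "simple"}, {"huge"}]`, `acc_at_1` gives 0.5 (only the first instance's top candidate is gold) while `potential_at_k` at k=3 gives 1.0 (both instances have at least one gold candidate in the top 3).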

Item Type: Thesis (Other)
Uncontrolled Keywords: Lexical simplification, Large Language Models, Multilingual lexical simplification, Inverse Attention
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA336 Artificial Intelligence
Q Science > QA Mathematics > QA76.6 Computer programming.
Q Science > QA Mathematics > QA76.87 Neural networks (Computer Science)
Divisions: Faculty of Science and Data Analytics (SCIENTICS) > Mathematics > 44201-(S1) Undergraduate Thesis
Depositing User: Orlantha Kendenan
Date Deposited: 07 Aug 2024 17:00
Last Modified: 07 Aug 2024 17:00
URI: http://repository.its.ac.id/id/eprint/113727
