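Building Trace Links Between Issues and Commits with Transfer Learning and Explainable Artificial Intelligence (original title: Pembuatan Trace Link Antara Issue dan Commit dengan Transfer Learning dan Explainable Artificial Intelligence)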

Puspa, Hanun Shaka (2025) Pembuatan Trace Link Antara Issue dan Commit dengan Transfer Learning dan Explainable Artificial Intelligence. Other thesis, Institut Teknologi Sepuluh Nopember.

Full text: 5025211051-Undergraduate_thesis.pdf (Accepted Version, 12MB). Restricted to repository staff only.

Abstract

Traceability is an important aspect of software development because it supports activities such as tracking changes to the software. It is typically established between pairs of artifacts: user requirements documented as issues and source code changes recorded as commits. In practice, traceability is hard to maintain because links are created manually and developers tend to neglect them. This thesis uses transfer learning by fine-tuning to build a model for a binary classification task: whether a trace link exists between an issue-commit pair. The models are built from the pretrained models RoBERTa, BERT, ALBERT, and DistilBERT. Fine-tuning is performed with four techniques: full fine-tuning, adapter-based tuning, Low-Rank Adaptation (LoRA), and prefix tuning. The datasets are the LinkFormer dataset, the 20-MAD dataset, and the issue-commit history of projects the author has worked on. Training is conducted under two dataset-splitting scenarios: random splitting and splitting that respects the order of authored timestamps. Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) are used to explain sample classifications produced by the best fine-tuned model. The best model is RoBERTa trained with full fine-tuning, which reaches an F1-score of 96.4% in the random-split scenario. LIME indicates that the most influential features in classification are the summary and the commit message. SHAP indicates that the most influential features are the summary, the commit message, and the diff.
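
The difference between the two dataset-splitting scenarios named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record fields (`issue`, `commit`, `authored`, `label`), the helper names, the toy data, and the 80/20 ratio are all assumptions made for illustration.

```python
import random
from datetime import datetime


def random_split(pairs, test_ratio=0.2, seed=42):
    """Shuffle issue-commit pairs, then hold out a share of them for testing."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def chronological_split(pairs, test_ratio=0.2):
    """Sort pairs by authored time so the test set holds only the newest pairs."""
    ordered = sorted(pairs, key=lambda p: p["authored"])
    n_test = max(1, round(len(ordered) * test_ratio))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Hypothetical toy data: ten labeled issue-commit pairs authored on consecutive days.
pairs = [
    {
        "issue": f"ISSUE-{i}",
        "commit": f"c{i:03d}",
        "authored": datetime(2024, 1, i + 1),
        "label": i % 2,  # 1 = trace link exists, 0 = no link
    }
    for i in range(10)
]

train_r, test_r = random_split(pairs)
train_c, test_c = chronological_split(pairs)
```

In the chronological variant every training pair is at least as old as every test pair, so the model is evaluated only on pairs authored after the ones it was trained on. A random split gives no such guarantee, which is one reason scores from the two scenarios can differ.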

Item Type: Thesis (Other)
Uncontrolled Keywords: Commit, explainable artificial intelligence, issue, LLM4SE, traceability
Subjects: T Technology > T Technology (General)
T Technology > T Technology (General) > T58.6 Management information systems
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Hanun Shaka Puspa
Date Deposited: 24 Jul 2025 08:46
Last Modified: 24 Jul 2025 08:46
URI: http://repository.its.ac.id/id/eprint/120940
