Labeling and Building an Image Captioning Model Using Deep Learning (Pelabelan dan Pembuatan Model Image Captioning Menggunakan Deep Learning)

Safitri, Wardatul Amalia and Arsyad, Hammuda (2025) Pelabelan dan Pembuatan Model Image Captioning Menggunakan Deep Learning. Project Report. [s.n.], [s.l.]. (Unpublished)

5025211006_5025211146-Project_Report.pdf - Accepted Version


Abstract

This practical work (internship) project was carried out at the Komputasi Cerdas dan Visi (KCV) Laboratory, Department of Informatics Engineering, Institut Teknologi Sepuluh Nopember (ITS). Its aim was to produce an image captioning model using deep learning. The project began by building a dataset of images of sidewalks and the surrounding ITS campus environment, and labeling each image with a description. The labeled dataset was then used to train image captioning models under several deep learning implementation scenarios. The deep learning methods implemented include LSTM, CNN, and GRU. The resulting models were compared using the BLEU and ROUGE metrics. The evaluation shows that Dataset 1 gave the best performance among the datasets, with a BLEU-1 of 51.49% and a ROUGE-1 of 40.05%. Among the methods, CNN gave the best results, with a BLEU-1 of 25.56% and a ROUGE-1 of 23.44%. In hyperparameter tuning, the ReLU activation function performed best, with a BLEU-1 of 23.61% and a ROUGE-1 of 22.37%.
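
For readers unfamiliar with the BLEU-1 and ROUGE-1 scores reported in the abstract, the following is a minimal sketch of how such scores can be computed for one generated caption against one reference caption. The abstract does not state the report's evaluation setup (tokenization, number of references per image, sentence-level or corpus-level scoring, or which ROUGE-1 variant is reported), so the function names, the whitespace tokenization, and the example captions below are all assumptions made for illustration.

```python
from collections import Counter
import math


def unigram_overlap(candidate, reference):
    """Count clipped unigram matches between two token lists."""
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    # A candidate token is credited at most as often as it occurs in the reference.
    return sum(min(count, ref_counts[token]) for token, count in cand_counts.items())


def bleu_1(candidate, reference):
    """Sentence-level BLEU-1 against a single reference (illustrative only).

    Clipped unigram precision multiplied by the brevity penalty.
    """
    if not candidate:
        return 0.0
    precision = unigram_overlap(candidate, reference) / len(candidate)
    # The brevity penalty lowers the score of captions shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * precision


def rouge_1(candidate, reference):
    """ROUGE-1 recall, precision, and F1 from unigram overlap (illustrative only)."""
    overlap = unigram_overlap(candidate, reference)
    recall = overlap / len(reference) if reference else 0.0
    precision = overlap / len(candidate) if candidate else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


# Hypothetical captions, not taken from the report's dataset.
reference = "a person walks on the sidewalk".split()
candidate = "a person on the sidewalk".split()

print(f"BLEU-1:  {bleu_1(candidate, reference):.4f}")
print(f"ROUGE-1: {rouge_1(candidate, reference)}")
```

In this example every generated word appears in the reference, so unigram precision is 1.0, but the caption is one word short, so the brevity penalty lowers BLEU-1 and ROUGE-1 recall is 5/6. Library implementations add details such as smoothing and multi-reference handling, so their scores may differ from this sketch.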

Item Type: Monograph (Project Report)
Uncontrolled Keywords: Image Captioning, Deep Learning, LSTM, CNN, GRU
Subjects: T Technology > T Technology (General) > T57.5 Data Processing
T Technology > T Technology (General) > T57.8 Nonlinear programming. Support vector machine. Wavelets. Hidden Markov models.
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Hammuda Arsyad
Date Deposited: 04 Feb 2025 06:56
Last Modified: 04 Feb 2025 06:56
URI: http://repository.its.ac.id/id/eprint/118175
