Priyanto, Akbar Putra Asenti and Pandya, Duevano Fairuz (2024) Eksplorasi Image Caption Generation Pada Dataset Fashion Menggunakan Model BLIP-2 [Exploration of Image Caption Generation on a Fashion Dataset Using the BLIP-2 Model]. Project Report. [s.n]. (Unpublished)
Text
5025211004_5025211052-Project_Report.pdf (1MB) - Accepted Version. Restricted to Repository staff only.
Abstract
The fashion industry attracts very strong interest, yet sellers often struggle to help customers choose suitable clothing from their catalogs. One solution is to describe each garment with caption text. Writing captions by hand, however, takes considerable time and effort, so automation is needed. This internship project (Kerja Praktik) explored the ability of the BLIP-2 model to generate caption text for a clothing dataset named FACAD. The distinguishing preprocessing steps included image padding and balancing. Testing showed that BLIP-2 generated caption text meeting five of the six predefined evaluation criteria, with the best model reaching a BLEU score of 0.36 and a ROUGE-L score of 0.4.
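For readers unfamiliar with the ROUGE-L figure quoted in the abstract, the following is a minimal sketch of how a sentence-level ROUGE-L F1 score can be derived from the longest common subsequence (LCS) of a reference caption and a generated caption. The whitespace tokenization, the function names `lcs_length` and `rouge_l_f1`, and the example captions are assumptions made for illustration; this is not the evaluation code used in the report, which may rely on a different tokenizer or F-measure weighting.

```python
def lcs_length(a, b):
    """Return the length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous row of the DP table
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L as the F1 of LCS-based precision and recall.

    Assumption: lowercase, whitespace-split tokens.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical captions, not taken from FACAD.
    reference = "a red cotton dress with short sleeves"
    candidate = "a red dress with long sleeves"
    print(round(rouge_l_f1(reference, candidate), 4))
```

Because the LCS preserves word order without requiring the words to be adjacent, a caption that keeps the reference's attributes in order scores well even when a few words differ.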
Item Type: | Monograph (Project Report) |
---|---|
Uncontrolled Keywords: | image caption, fashion, BLIP-2, image description generation, clothing. |
Subjects: | T Technology > T Technology (General) > T57.5 Data Processing; T Technology > T Technology (General) > T59.7 Human-machine systems. |
Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis |
Depositing User: | Akbar Putra Asenti Priyanto |
Date Deposited: | 23 Dec 2024 07:26 |
Last Modified: | 23 Dec 2024 07:26 |
URI: | http://repository.its.ac.id/id/eprint/116039 |