Tsaniya, Hilya (2022) Medical Image Captioning Menggunakan Multi-Head Attention Dengan Gamma Correction [Medical Image Captioning Using Multi-Head Attention with Gamma Correction]. Masters thesis, Institut Teknologi Sepuluh Nopember.
Text
6025211016-Master_Thesis.pdf - Accepted Version (3MB). Restricted to Repository staff only until 1 April 2025.
Abstract
The main objective of developing a medical image captioning system is to help radiologists write medical reports with a low error rate. Such a system can also serve as a learning aid and as an alternative where professional experts are in short supply. Existing research on medical image captioning is limited by low coherence and low text similarity between the predicted text and the reference text. One cause is the noise that frequently occurs during medical image acquisition. This study aims to develop a medical image captioning system that uses a transformer together with image quality enhancement. The method is encoder-decoder based: visual features are extracted with a Convolutional Neural Network (CNN), and the decoder is a Long Short Term Memory (LSTM) network with a multi-head attention mechanism and BERT weighting. Gamma correction is applied to enhance the images and thereby improve the performance of the model. In the experiments, evaluated with the Bilingual Evaluation Understudy (BLEU) metric, the multi-head attention method with gamma-corrected images gives predictions 15% better than the CNN-LSTM baseline model. Compared with the original images, gamma-corrected images also improve the performance of the multi-head attention method by 5%. Overall, the experiments show that gamma correction and multi-head attention improve the model's performance in interpreting images and produce more coherent predictions.
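For readers unfamiliar with the image enhancement step named in the abstract, gamma correction is commonly written as the power-law mapping out = max * (in / max) ** gamma. The following is a minimal, dependency-free sketch of that mapping, not the thesis implementation: the function name `gamma_correct`, the 8-bit intensity range, and the gamma values are assumptions chosen for illustration, since the record does not state which settings were used. Some libraries define the exponent as 1/gamma instead, so the direction of the effect depends on the convention.

```python
def gamma_correct(image, gamma, max_value=255):
    """Apply power-law (gamma) correction to a grayscale image.

    `image` is a list of rows of integer intensities in [0, max_value].
    The name, signature, and 8-bit default are illustrative assumptions.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Lookup table: out = max_value * (in / max_value) ** gamma
    lut = [round(max_value * (v / max_value) ** gamma)
           for v in range(max_value + 1)]
    return [[lut[pixel] for pixel in row] for row in image]


# Toy 2x3 "scan" (made-up values, not data from the thesis).
scan = [[0, 64, 128], [192, 255, 32]]

# With this convention, gamma < 1 brightens mid-tones and gamma > 1 darkens them.
brightened = gamma_correct(scan, gamma=0.5)
darkened = gamma_correct(scan, gamma=2.0)
```

Black and white are fixed points of the mapping under any positive gamma, so only the intermediate intensities shift. In a pipeline like the one the abstract describes, a step of this kind would run on each image before the CNN extracts visual features.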
Item Type: | Thesis (Masters) |
---|---|
Uncontrolled Keywords: | Medical image captioning, Multi-head attention, Gamma correction, BERT Embedding |
Subjects: | T Technology > T Technology (General) |
Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55101-(S2) Master Thesis |
Depositing User: | Hilya Tsaniya |
Date Deposited: | 01 Feb 2023 03:55 |
Last Modified: | 01 Feb 2023 03:57 |
URI: | http://repository.its.ac.id/id/eprint/95951 |