Neural Network and Graph-Based Approaches for Understanding the Visual Appeal of Food Photos (Pendekatan Berbasis Jaringan Saraf dan Graf untuk Memahami Daya Tarik Visual Foto Makanan)

Fernanda, Muhamad Faiz (2025) Pendekatan Berbasis Jaringan Saraf dan Graf untuk Memahami Daya Tarik Visual Foto Makanan. Other thesis, Institut Teknologi Sepuluh Nopember.

5025211186-Undergraduate_Thesis.pdf - Accepted Version


Abstract

In the digital era, food photos have become an important part of culinary marketing, especially on social media. However, the aesthetics of a food photo are often subjective and difficult to evaluate automatically. This final project addresses the aesthetic evaluation of food photos, with a focus on understanding overall visual appeal. Graph Convolutional Network (GCN) and Graph Attention Network (GAT) methods are applied to analyze the complex relationships between visual elements in an image. The methodology comprises image feature extraction with a Convolutional Neural Network (CNN), namely InceptionResNetV2, together with two additional features: layout and color moments. All features are represented as a graph, which is then processed using a GCN model for classification. The models are evaluated using metrics such as accuracy, precision, recall, and F1-score. The data come from the Gourmet Photography Dataset (GPD), which consists of 24,000 food images. The experiments show that a split of 90% training data and 10% test data gives superior overall performance. The feature ablation shows that combining CNN features with layout gives the best performance for both the GCN and GAT models. Overall, the GAT model outperforms GCN, with an accuracy of 0.9075, a precision of 0.9388, a recall of 0.8886, and an F1-score of 0.913. These findings confirm that the texture features from the CNN have a strong influence on aesthetic assessment, while additional features such as layout and color moments act as complements that help improve model performance. The approach shows that integrating neural networks with graph structures can be used to assess the visual appeal of images in a more objective, measurable, and systematic way.
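As a quick sanity check on the reported GAT metrics, the sketch below recomputes the F1-score from the stated precision and recall. It assumes the standard definition of F1 as the harmonic mean of precision and recall, which the abstract does not spell out; the function name `f1_from_precision_recall` is hypothetical and is not taken from the thesis.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return F1 as the harmonic mean of precision and recall.

    Assumption: the thesis uses this standard binary F1 definition.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the GAT model.
reported_precision = 0.9388
reported_recall = 0.8886
reported_f1 = 0.913

computed_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.4f}, reported F1 = {reported_f1}")
```

Under that assumption, the recomputed value agrees with the reported F1-score of 0.913 to three decimal places.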

Item Type: Thesis (Other)
Uncontrolled Keywords: Aesthetic, Convolutional Neural Network, Food Photos, Graph Convolutional Network, Graph Attention Network, InceptionResNetV2.
Subjects: T Technology > T Technology (General) > T57.5 Data Processing
T Technology > T Technology (General) > T57.74 Linear programming
T Technology > T Technology (General) > T58.6 Management information systems
T Technology > T Technology (General) > T58.62 Decision support systems
T Technology > T Technology (General) > T59.7 Human-machine systems.
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Muhamad Faiz Fernanda
Date Deposited: 29 Jul 2025 02:02
Last Modified: 29 Jul 2025 02:02
URI: http://repository.its.ac.id/id/eprint/122398
