Modifying the Pix2pix Architecture by Adding Edge Information for Image-to-Image Translation

Ikhsan, Gusna (2024) Modifikasi Arsitektur Pix2pix Dengan Menambahkan Informasi Tepi Pada Kasus Image-to-image Translation [Modifying the Pix2pix Architecture by Adding Edge Information for Image-to-Image Translation]. Master's thesis, Institut Teknologi Sepuluh Nopember.

6025201020_Master_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2026.

Abstract

Image generation is a technique for synthesizing images. Image-to-image translation is a branch of image generation that transforms a source image into a target image. This research focuses on translating sketches of human faces into realistic face images. In policing, such translation can help identify perpetrators of crimes from facial sketches based on victims' statements. Image generation here uses the Generative Adversarial Network (GAN), a method that relies on deep learning to compute its weights. For image-to-image translation, the GAN was extended into the Conditional Generative Adversarial Network (cGAN), and Pix2pix is one of the most widely used cGAN variants for this task. The challenge is to make the generated face resemble the real face more closely: generated images are often blurry and do not resemble the target's real face. This research modifies the Pix2pix architecture by adding edge information and compares the generated results across several edge-extraction methods, with the goal of producing sharp and accurate realistic faces from sketches. Edge information is extracted with three methods: Laplacian, Sobel, and Prewitt. The extraction is applied to the ground-truth image and to the generated (fake) image before they are processed by the discriminator, so that the discriminator receives more detailed and accurate information during training. The modified architecture is modeled on a dataset from CUHK containing 188 realistic face images, split into 88 training images and 100 test images.
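The edge-extraction step can be illustrated with a minimal NumPy sketch. The 3x3 kernels are the standard textbook forms of the three operators; the abstract does not say how the edge map is combined with the image, so the helper `with_edge_channel`, which stacks the edge map as an extra input channel, is a hypothetical assumption and not necessarily the thesis's design. The function names `filter3x3` and `edge_map` are likewise illustrative.

```python
import numpy as np

# Standard 3x3 kernels for the three operators named in the abstract.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float64)
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float64)


def filter3x3(img, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel (edge-replicated padding)."""
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            # Shifted view of the padded image, weighted by one kernel entry.
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def edge_map(img, method):
    """Return a non-negative edge-strength map for a grayscale image."""
    if method == "laplacian":
        return np.abs(filter3x3(img, LAPLACIAN))
    kx = {"sobel": SOBEL_X, "prewitt": PREWITT_X}[method]
    gx = filter3x3(img, kx)      # horizontal gradient
    gy = filter3x3(img, kx.T)    # vertical gradient (transposed kernel)
    return np.hypot(gx, gy)      # gradient magnitude


def with_edge_channel(img, method):
    """Hypothetical: stack image and edge map as a 2-channel discriminator input."""
    return np.stack([img.astype(np.float64), edge_map(img, method)], axis=0)


# A vertical step edge: left half black, right half white.
step = np.zeros((5, 6))
step[:, 3:] = 1.0
print(edge_map(step, "prewitt")[0])
```

Under this assumption, both the ground-truth image and the generated image would pass through the same helper before reaching the discriminator, so the two inputs stay comparable.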
The modified architecture was trained for 200 epochs on the Google Colaboratory Plus platform. The generated images were evaluated with the Structural Similarity Index (SSIM), which measures how similar a generated image is to the original image. Among the Pix2pix variants with added edge information, the Pix2pix+Prewitt combination achieved the best result on the test data, with an average SSIM of 0.7048.
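To make the evaluation metric concrete, the sketch below computes a simplified SSIM in NumPy. It is an assumption-laden illustration: it evaluates the SSIM formula once over the whole image, whereas common SSIM implementations average the formula over local sliding windows, and the abstract does not state which variant or settings the thesis used. The constants `k1 = 0.01` and `k2 = 0.03` are the conventional defaults, and the names `global_ssim` and `mean_ssim` are illustrative.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole image (simplified; no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")
    # Small constants that stabilise the ratio when means/variances are near zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_ssim(pairs, data_range=255.0):
    """Average SSIM over (original, generated) image pairs, e.g. a test set."""
    return float(np.mean([global_ssim(a, b, data_range) for a, b in pairs]))


# Toy example: an image compared with itself and with its inverted copy.
original = np.arange(16, dtype=np.float64).reshape(4, 4) * 16
inverted = 255.0 - original
print(global_ssim(original, original))
print(global_ssim(original, inverted))
```

An identical pair scores 1, and the score drops as structure diverges; a reported average such as 0.7048 corresponds to averaging the per-image score over all test pairs, as `mean_ssim` does.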

Item Type: Thesis (Masters)
Uncontrolled Keywords: image generation, image, cGAN, Pix2pix.
Subjects: T Technology > T Technology (General)
T Technology > T Technology (General) > T57.5 Data Processing
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55101-(S2) Master Thesis
Depositing User: Gusna Ikhsan
Date Deposited: 10 Aug 2024 15:16
Last Modified: 26 Aug 2024 02:52
URI: http://repository.its.ac.id/id/eprint/114293
