Perbandingan Algoritma Deep Deterministic Policy Gradient Dan Deep Q-Network Untuk Path Planning Robot OpenManipulator-X

Rasendra, Tio Khansa (2026) Perbandingan Algoritma Deep Deterministic Policy Gradient Dan Deep Q-Network Untuk Path Planning Robot OpenManipulator-X. Other thesis, Institut Teknologi Sepuluh Nopember.

Full text: 5022221008-Undergraduate_Thesis.pdf (3MB). Restricted to Repository staff only; a copy may be requested.

Abstract

This study compares the performance of two Reinforcement Learning algorithms, Deep Q-Network (DQN), which uses a discrete action space, and Deep Deterministic Policy Gradient (DDPG), which uses a continuous action space, in executing path planning missions for the OpenManipulator-X robot. This platform was selected for its open-source nature. The process begins with designing the forward kinematics using Denavit-Hartenberg (D-H) parameters to accurately determine the end-effector position. Both algorithms were trained for 5,000 episodes with an identical reward function. Comparison metrics include Mean Reward, success rate, training computation time, and resulting path length, in both MATLAB simulations and physical hardware implementation. The results indicate that DQN is more efficient during the training phase, converging by the 1,000th episode within 10 minutes and reaching a 75% success rate. In contrast, DDPG required 45 minutes and had not fully converged by the 5,000th episode, with a success rate of 68%. However, in terms of navigation quality, DDPG consistently outperformed DQN, producing paths 12.07% shorter in the single-obstacle scenario and up to 35.00% shorter in the five-obstacle scenario. Real-world testing reinforced DDPG's superiority: it reached the target without touching any obstacle, whereas DQN made minor contact with the obstacle above the target. It can be concluded that the flexibility of DDPG's continuous action space makes it better suited to robotic systems requiring high precision and safety in cluttered environments.
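The abstract's first technical step, computing the end-effector position from D-H parameters, can be illustrated with a minimal sketch. The D-H table below uses placeholder link lengths, not the OpenManipulator-X's actual parameters from the thesis, and is assumed only for demonstration of how per-joint homogeneous transforms are chained:

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Standard Denavit-Hartenberg homogeneous transform for one joint."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(joint_angles, dh_table):
    """Chain per-joint transforms; returns the 4x4 base-to-end-effector pose."""
    T = np.eye(4)
    for theta, (d, a, alpha) in zip(joint_angles, dh_table):
        T = T @ dh_transform(theta, d, a, alpha)
    return T

# Illustrative 4-DOF D-H table (d, a, alpha) per joint -- placeholder
# values, NOT the OpenManipulator-X's actual link parameters.
DH_TABLE = [
    (0.077, 0.0,   np.pi / 2),
    (0.0,   0.130, 0.0),
    (0.0,   0.124, 0.0),
    (0.0,   0.126, 0.0),
]

pose = forward_kinematics([0.0, 0.0, 0.0, 0.0], DH_TABLE)
end_effector_xyz = pose[:3, 3]  # Cartesian position of the end-effector
```

With all joint angles at zero, the links simply extend along the base x-axis, so the position is the sum of the link lengths in x and the base offset in z; this kind of check is how a D-H model is typically validated before being used as the state source for RL training.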

Item Type: Thesis (Other)
Uncontrolled Keywords: Deep Deterministic Policy Gradient, Deep Q-Network, OpenManipulator-X, Path Planning, Reinforcement Learning
Subjects: T Technology > TJ Mechanical engineering and machinery > TJ211.4 Robot motion
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Electrical Engineering > 20201-(S1) Undergraduate Thesis
Depositing User: Tio Khansa Rasendra
Date Deposited: 29 Jan 2026 03:33
Last Modified: 29 Jan 2026 03:33
URI: http://repository.its.ac.id/id/eprint/130919
