Improving Chatbot Performance on a Distributed System Architecture Using Multi-GPU and the Ray Framework

Pambudi, Robby Ulung (2025) Peningkatan Kinerja Chatbot Pada Arsitektur Sistem Terdistribusi Menggunakan Multi-Gpu Dan Framework Ray. Other thesis, Institut Teknologi Sepuluh Nopember.

Abstract

This research develops a chatbot system built on a distributed architecture using multiple GPUs and the Ray framework. The system integrates Ray as the orchestrator for task distribution, vLLM as the inference engine for large language models, text embedding based on Sentence Transformers for semantic processing, and ChromaDB as the vector database. Ray acts as the main coordinator that manages workload distribution across the system. Tests were conducted on three models of different sizes (0.5B, 1.5B, and 3B parameters) under three configurations: Single GPU, Multi-GPU, and Multi-GPU Multi-Node. The results show that the multi-GPU distributed architecture improves chatbot performance, with throughput up to 15-20% higher in the Multi-GPU Multi-Node configuration than on a single GPU. The Multi-GPU configuration achieved the highest GPU utilization, peaking at 30% on the 3B-parameter model, and reduced GPU memory usage from 18 GB to about 13 GB per GPU through workload distribution. The Multi-GPU Multi-Node configuration showed the most efficient GPU utilization, at 4-10% for all model sizes, with consistent CPU utilization of 77-78% and system RAM usage reduced to 8-9 GB. This research demonstrates the effectiveness of a distributed system architecture in optimizing parallel processing for various types of chatbot workloads.
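To put the reported figures on a common scale, the short Python sketch below converts the GPU memory change (18 GB to about 13 GB per GPU) into a relative reduction and expresses the 15-20% throughput gain as a multiplier. The helper name `relative_change` and the baseline of 100 requests per second are hypothetical and chosen only for illustration; the thesis abstract does not report absolute throughput.

```python
def relative_change(before: float, after: float) -> float:
    """Return the signed relative change from `before` to `after` as a fraction."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before


# GPU memory per GPU as stated in the abstract: 18 GB -> about 13 GB.
mem_change = relative_change(18.0, 13.0)
print(f"GPU memory per GPU: {mem_change:+.1%}")  # prints "GPU memory per GPU: -27.8%"

# Hypothetical baseline throughput (assumption, not a figure from the thesis).
baseline_rps = 100.0

# A 15-20% gain corresponds to multiplying the baseline by 1.15 to 1.20.
low = baseline_rps * 1.15
high = baseline_rps * 1.20
print(f"Throughput range for a {baseline_rps:.0f} req/s baseline: {low:.0f}-{high:.0f} req/s")
```

Read this way, the memory saving per GPU (roughly 28%) is proportionally larger than the reported throughput gain, which is consistent with the abstract's emphasis on resource distribution as well as speed.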

Item Type: Thesis (Other)
Uncontrolled Keywords: Chatbot, Distributed System, multi-GPU, Ray Framework, vLLM
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Robby Ulung Pambudi
Date Deposited: 30 Jul 2025 03:35
Last Modified: 30 Jul 2025 03:35
URI: http://repository.its.ac.id/id/eprint/124068
