Sakti, Muhammad Haikal Aria (2025) Analisis Metode Clustering untuk Pengelompokkan Dataset Retail. Other thesis, Institut Teknologi Sepuluh Nopember.
Text
05111940000088-Undergraduate_Thesis.pdf - Accepted Version (3MB). Restricted to Repository staff only until 1 April 2027.
Abstract
The development of information technology has had a significant impact on business operations, particularly in the retail sector. Easier access to digital data allows retail companies to collect large volumes of sales data every day, containing valuable information about customer behavior and preferences. However, the sheer volume of collected data makes it difficult to manage. Appropriate analytical methods are therefore required to turn this data into useful information.
This study analyzes and compares three clustering methods, DBSCAN, Agglomerative Hierarchical Clustering (AHC), and K-Means, for grouping retail sales data. The Silhouette Score and the Davies-Bouldin Index (DBI) are used as evaluation metrics to compare the performance of the three methods. The research stages comprise data preprocessing, followed by clustering with manual tuning and with hyperparameter optimization.
The results indicate that AHC with manual tuning is the best clustering method for this retail dataset, achieving the highest Silhouette Score (0.743) and the lowest Davies-Bouldin Index (0.181). AHC produces two clusters under both manual tuning and hyperparameter optimization, but manual tuning yields a more balanced cluster distribution. The resulting clusters reflect sales patterns based on total sales price, discount, number of products sold, and unit price. Because it gives clearer segmentation, AHC is recommended for the analysis of similar data, to help decision-makers design more effective, data-driven business strategies.
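For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal pure-Python sketch of their standard definitions (Euclidean distance, centroid-based scatter for DBI). It is not the thesis code: the function names and the toy points are hypothetical and chosen only for illustration, and the record does not state which implementation the author used.

```python
import math
from collections import defaultdict


def _group(points, labels):
    """Collect points by cluster label."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    if len(clusters) < 2:
        raise ValueError("both metrics need at least two clusters")
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette coefficient; range [-1, 1], higher is better."""
    clusters = _group(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


def davies_bouldin_index(points, labels):
    """Davies-Bouldin Index; non-negative, lower is better.

    Assumes no two cluster centroids coincide.
    """
    clusters = _group(points, labels)
    centroids = {
        lab: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for lab, pts in clusters.items()
    }
    # scatter: mean distance from each member to its cluster centroid
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in pts) / len(pts)
        for lab, pts in clusters.items()
    }
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)


# Hypothetical toy data: two compact, well-separated groups of 2-D points.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
separated = [0, 0, 0, 0, 1, 1, 1, 1]    # labels that match the two groups
interleaved = [0, 1, 0, 1, 0, 1, 0, 1]  # labels that cut across the groups

for name, labels in [("separated", separated), ("interleaved", interleaved)]:
    print(name, silhouette_score(points, labels), davies_bouldin_index(points, labels))
```

On this toy data the labeling that matches the two groups scores a higher Silhouette Score and a lower Davies-Bouldin Index than the interleaved labeling, which is the same direction of comparison the abstract uses to rank the three methods.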
Item Type: | Thesis (Other) |
---|---|
Uncontrolled Keywords: | Clustering, DBSCAN, Agglomerative, K-Means, Retail Sales Data Analysis, Silhouette Score, Davies-Bouldin Index |
Subjects: | T Technology > T Technology (General) > T385 Visualization--Technique; T Technology > T Technology (General) > T57.5 Data Processing; T Technology > T Technology (General) > T58.64 Information resources management; T Technology > T Technology (General) > T58.8 Productivity. Efficiency |
Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis |
Depositing User: | Muhammad Haikal Aria Sakti |
Date Deposited: | 02 Feb 2025 02:30 |
Last Modified: | 02 Feb 2025 02:30 |
URI: | http://repository.its.ac.id/id/eprint/117078 |