Text Clustering Insiden Pada Lingkungan Kerja Warehouse Dengan Metode Density-Based Spatial Clustering Of Applications With Noise

Raharjo, Aditya Tri (2021) Text Clustering Insiden Pada Lingkungan Kerja Warehouse Dengan Metode Density-Based Spatial Clustering Of Applications With Noise. Undergraduate thesis, Institut Teknologi Sepuluh Nopember.

06211740000045-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2023.

Abstract

Incidents are unwelcome to all parties because they can cause losses, particularly material losses. Warehouses are among the workplaces where incidents occur frequently. Incidents are often reported as descriptions or free text, which makes it difficult to identify what happened. This research therefore applies text clustering to group the incidents that occur in warehouses, so that the company can identify them more easily. It uses the DBSCAN method with unigram and bigram word combinations, and compares three sets of results: those obtained with Genetic Algorithm feature selection, those obtained without it, and those from previous research. Data exploration was also carried out to determine the characteristics of the incidents. The results show that feature selection has a significant effect on reducing the noise produced by DBSCAN, whereas the use of unigram and bigram word combinations has no significant effect on cluster quality. In the Technology and Retail sectors, the best clusters are formed from the unigram word combination with feature selection. The SPL (Spare Part Logistics) sector produces its best clusters from the unigram word combination without feature selection. For the Lifestyle sector, the best clusters are formed from the bigram word combination with feature selection. For the Consumer sector, the best clusters were produced by previous research using Hierarchical Clustering.
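
As a rough illustration of the approach the abstract describes, the sketch below clusters a handful of free-text incident reports with DBSCAN over combined unigram and bigram counts. It is not the thesis implementation. The sample reports, the raw term-count weighting, the cosine distance, and the `eps` and `min_pts` values are all assumptions chosen for the example, and the Genetic Algorithm feature selection step is left out entirely.

```python
import math
from collections import Counter

def ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def vectorize(text, ns=(1, 2)):
    """Count unigram and bigram features (raw counts; an assumed weighting)."""
    counts = Counter()
    for n in ns:
        counts.update(ngrams(text, n))
    return counts

def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors (assumed metric)."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 - dot / norm if norm else 1.0

def dbscan(vectors, eps, min_pts):
    """Label each vector with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(vectors)

    def region(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n)
                if cosine_distance(vectors[i], vectors[j]) <= eps]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

# Hypothetical incident reports, invented for this example.
reports = [
    "forklift hit rack in aisle",
    "forklift hit rack near dock",
    "forklift hit pallet rack",
    "worker slipped on wet floor",
    "worker slipped on oily floor",
    "worker slipped on wet floor near dock",
    "fire alarm triggered by smoke",
]
labels = dbscan([vectorize(r) for r in reports], eps=0.6, min_pts=2)
for label, report in zip(labels, reports):
    print(label, report)
```

With these assumed settings the forklift reports and the slipping reports form two clusters, and the fire alarm report is labelled -1, which is the noise category whose size the abstract says feature selection reduces. Passing `ns=(1,)` or `ns=(2,)` to `vectorize` restricts the features to unigrams or bigrams only.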

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Feature Selection, Incident, Text Clustering, Warehouse, Word Combination
Subjects: H Social Sciences > HA Statistics
H Social Sciences > HA Statistics > HA30.6 Spatial analysis
Q Science > QA Mathematics > QA278 Cluster Analysis. Multivariate analysis. Correspondence analysis (Statistics)
Divisions: Faculty of Science and Data Analytics (SCIENTICS) > Statistics > 49201-(S1) Undergraduate Thesis
Depositing User: Aditya Tri Raharjo
Date Deposited: 03 Sep 2021 02:31
Last Modified: 03 Sep 2021 02:31
URI: http://repository.its.ac.id/id/eprint/91440
