Jasir, Abdullah Nasih (2025) Penemuan Kembali Informasi Pada Al-Qur'an Dengan Metode Klastering [Information Retrieval on the Qur'an Using Clustering Methods]. Other thesis, Institut Teknologi Sepuluh Nopember.
Text: 5025211111-Undergraduate_Thesis.pdf - Accepted Version (9MB). Restricted to repository staff only.
Abstract
The Qur'an is the holy book of Islam and covers many aspects of life. However, the large number of verses and the spread of topics across different surahs make it difficult to find specific information efficiently. An Information Retrieval (IR) system can help users find the verses relevant to a given topic. Keyword-based approaches are limited because they rely on textual similarity rather than meaning. This final project explores the benefits of clustering methods in an IR system for the Qur'an. The dataset consists of Indonesian translations of Qur'anic verses. Four feature extraction methods were applied: Word2Vec, BERT, TF-IDF, and a combination of TF-IDF, LDA, and PCA. Three clustering algorithms were used to group the translated verses: K-Means, Agglomerative Hierarchical Clustering (AHC), and DBSCAN. Cluster quality was evaluated with three metrics: Silhouette Score, Davies-Bouldin Index (DBI), and Entropy. The IR system was evaluated with Precision and Recall. By Silhouette Score, the combination of DBSCAN and Word2Vec performed best, with a value of 0.65392. By DBI, the combination of K-Means and Word2Vec performed best, with a value of 0.63647. By Entropy, the combination of AHC and TF-IDF + LDA + PCA performed best, with a value of 1.92258. For IR performance, the best combination was DBSCAN with TF-IDF features, averaging a precision of 0.66 and a recall of 0.955.
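The abstract does not include code, so the following is only a minimal sketch of how a cluster-based retrieval step with Precision and Recall might look, not the thesis implementation. Several things are assumptions made for illustration: the five toy English sentences (invented placeholders, not actual verse translations), the hard-coded `labels` list (a stand-in for the output of K-Means, AHC, or DBSCAN), the TF-IDF weighting formula, the rule of routing a query to the cluster whose centroid has the highest cosine similarity, and all function names.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed weighting: raw term count times (ln(n / df) + 1).
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vecs = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vecs):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vecs:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vecs) for term, weight in total.items()}


def retrieve(query_tokens, vecs, idf, labels):
    """Return the members of the cluster nearest to the query, best match first."""
    query = {t: c * idf[t] for t, c in Counter(query_tokens).items() if t in idf}
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    best = max(
        clusters,
        key=lambda label: cosine(query, centroid([vecs[i] for i in clusters[label]])),
    )
    return sorted(clusters[best], key=lambda i: cosine(query, vecs[i]), reverse=True)


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented toy corpus; a real run would use the translated verses.
docs = [
    "give charity to the poor".split(),
    "charity for the needy and the poor".split(),
    "be kind to the poor orphan".split(),
    "establish prayer at dawn".split(),
    "prayer at night and dawn".split(),
]
labels = [0, 0, 0, 1, 1]  # hypothetical cluster assignment
relevant = {0, 1}         # hypothetical relevance judgments for the query

vecs, idf = tfidf_vectors(docs)
retrieved = retrieve("charity poor".split(), vecs, idf, labels)
precision, recall = precision_recall(retrieved, relevant)
print(retrieved, round(precision, 2), recall)
```

Returning a whole cluster tends to favor recall over precision, since every relevant item in the chosen cluster is returned along with its less relevant neighbors. That is the same direction as the precision and recall figures reported in the abstract, although this toy example cannot confirm the cause.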
Item Type: | Thesis (Other) |
---|---|
Uncontrolled Keywords: | Al-Qur'an, Clustering, Cosine Similarity, Feature Extraction, Information Retrieval |
Subjects: | T Technology > T Technology (General) |
Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis |
Depositing User: | Abdullah Nasih Jasir |
Date Deposited: | 28 Jul 2025 07:19 |
Last Modified: | 28 Jul 2025 07:19 |
URI: | http://repository.its.ac.id/id/eprint/122148 |