Improving the CHARM Algorithm for Mining Frequent Closed Itemsets (Perbaikan Algoritma Charm Untuk Penggalian Frequent Closed Itemsets)

Mardiyanto, Mardiyanto (2007) Perbaikan Algoritma Charm Untuk Penggalian Frequent Closed Itemsets. Masters thesis, Institut Teknologi Sepuluh Nopember Surabaya.

Abstract

Mining frequent closed itemsets is an important part of association rule mining because the closed itemsets uniquely determine the set of all frequent itemsets and their supports. Several algorithms for mining frequent closed itemsets have been proposed, including CHARM and DCI_CLOSED. CHARM uses the diffset vertical data format and a subsumption check to detect duplicates. The subsumption check is memory-inefficient because it requires keeping all previously found frequent closed itemsets in main memory. DCI_CLOSED uses the bitvector vertical data format and an order-preserving method to detect duplicates. The order-preserving method is efficient because it does not require keeping previously found frequent closed itemsets in main memory. Based on existing research and theory on frequent closed itemset mining, no algorithm yet combines the diffset vertical data format with duplicate checking that avoids storing all previously found frequent closed itemsets. This leaves a research opportunity to design an improved CHARM algorithm that uses memory more efficiently. The methodology of this research therefore centers on designing an improvement to CHARM that uses memory more efficiently while enumerating frequent closed itemsets. The method combines a subsumption check restricted to the branch currently being enumerated with the order-preserving method for duplicate checking. The resulting algorithm was implemented in Visual C++ and tested on various types of test data.
To measure the success of the design, two test scenarios were used: a comparison of memory usage efficiency between the designed algorithm and CHARM, and a comparison of computation time characteristics between the two. The results show that the improved algorithm uses memory more efficiently than CHARM as the minimum support value decreases. However, the computation time results give no strong indication that the improved algorithm is faster than CHARM.
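
The three ingredients named in the abstract (diffsets, the subsumption check, and the order-preserving check) can be illustrated with a minimal Python sketch. It is not the thesis's Visual C++ implementation, and it does not reproduce the hybrid method the thesis designs. The toy database, the alphabetical item order, and all function names are assumptions made for illustration, and the order-preserving test is a simplified form of the idea. The point to notice is what each duplicate check needs in memory: `is_subsumed` takes the whole collection of closed itemsets found so far, while `is_order_preserving` only looks at the tidsets of earlier items.

```python
# Toy transaction database (hypothetical; not data from the thesis).
TRANSACTIONS = {
    1: {"A", "C", "T", "W"},
    2: {"C", "D", "W"},
    3: {"A", "C", "T", "W"},
    4: {"A", "C", "D", "W"},
    5: {"A", "C", "D", "T", "W"},
    6: {"C", "D", "T"},
}
# Assumed fixed item order: A < C < D < T < W.
ITEMS = sorted(set().union(*TRANSACTIONS.values()))


def tidset(itemset):
    """IDs of the transactions that contain every item of `itemset`."""
    return {tid for tid, items in TRANSACTIONS.items() if set(itemset) <= items}


def diffset(prefix, item):
    """Diffset of prefix+item relative to prefix: the tids lost by adding `item`."""
    return tidset(prefix) - tidset(set(prefix) | {item})


def is_subsumed(itemset, support, stored_closed):
    """Subsumption check: scans every closed itemset stored so far.

    `stored_closed` maps frozenset -> support, so it grows with the output.
    """
    return any(set(itemset) <= closed and support == sup
               for closed, sup in stored_closed.items())


def is_order_preserving(closed_prefix, item):
    """Simplified order-preserving check: no stored closed itemsets needed.

    The extension is a duplicate if some earlier item outside the prefix
    occurs in every transaction that supports prefix + item.
    """
    gen_tids = tidset(set(closed_prefix) | {item})
    earlier = [j for j in ITEMS if j < item and j not in closed_prefix]
    return not any(gen_tids <= tidset({j}) for j in earlier)


# Supports computed from diffsets instead of full tidsets.
prefix = {"C"}
d_cw = diffset(prefix, "W")              # {6}
d_cd = diffset(prefix, "D")              # {1, 3}
sup_cw = len(tidset(prefix)) - len(d_cw)  # 6 - 1 = 5
d_cwd = d_cd - d_cw                      # diffset of CWD relative to CW
sup_cwd = sup_cw - len(d_cwd)            # 5 - 2 = 3

print(sup_cw, sup_cwd)
print(is_subsumed({"C", "T", "W"}, 3, {frozenset("ACTW"): 3}))
print(is_order_preserving({"C", "T"}, "W"))
```

In this toy run, extending the closed itemset CT with W is flagged as a duplicate by both checks: the subsumption check finds the stored superset ACTW with the same support, and the order-preserving check notices that the earlier item A occurs in every transaction containing CTW.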

Item Type: Thesis (Masters)
Additional Information: RTIf 005.1 Mar p-1 2007
Uncontrolled Keywords: penggalian frequent closed itemset, perbaikan algoritma CHARM, format data vertikal Diffset; extracting frequent closed itemsets, improving the CHARM algorithm, Diffset vertical data format
Subjects: Q Science > QA Mathematics > QA9.58 Algorithms
Divisions: Faculty of Information Technology > Informatics Engineering > 55101-(S2) Master Thesis
Depositing User: EKO BUDI RAHARJO
Date Deposited: 01 Jul 2024 06:01
Last Modified: 01 Jul 2024 06:01
URI: http://repository.its.ac.id/id/eprint/108097
