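Pembangkitan Kaidah Asosiasi Dari Top-K Frequent Closed Itemset Yang Didasarkan Pada Struktur Data Berbasis Lattice [Generating Association Rules from Top-K Frequent Closed Itemsets Based on a Lattice Data Structure]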

Hapsari, Dian Puspita (2008) Pembangkitan Kaidah Asosiasi Dari Top-K Frequent Closed Itemset Yang Didasarkan Pada Struktur Data Berbasis Lattice. Masters thesis, Institut Teknologi Sepuluh Nopember Surabaya.

Full text: 5104201023-Master_Thesis.pdf (Accepted Version, 8MB)

Abstract

Association rule mining basically consists of two main stages: generating frequent itemsets and generating association rules. Previous research offers two main families of algorithms for generating frequent itemsets: algorithms based on a lattice data structure, such as CHARM-L (Closed Association Rule Mining-Lattice), and algorithms based on the FP-Tree data structure, such as TFP (Top Frequent Pattern). In the research on mining frequent closed itemsets, no lattice-based algorithm has yet been developed that uses a top-k constraint in place of a user-specified minimum support value, which leaves room to design and implement a lattice-based algorithm for mining top-k frequent closed itemsets. This research develops an algorithm for generating association rules from top-k frequent closed itemsets based on a lattice data structure. The top-k value replaces the minimum support threshold as the measure of how frequently an itemset must occur to be among the frequent itemsets the user most wants. The user only needs to enter a positive integer k to generate all association rules from the top-k frequent closed itemsets that satisfy the minimum confidence value, or correlation level, set by the user. The algorithm developed in this research was implemented in the C programming language on the Windows operating system. Its computation time for generating top-k frequent closed itemsets was compared with that of a comparable algorithm that generates top-k frequent closed itemsets using a frequent-pattern tree data structure (the TFP algorithm).
The test results show that the implemented algorithm produces all the association rules desired by the user from the top-k frequent closed itemsets generated for the given value of k. Although the developed algorithm is slightly slower than the comparison algorithm at generating top-k frequent closed itemsets, the resulting lattice data structure makes it easier to determine subset and superset relationships between itemsets while generating association rules. This advantage is especially useful for reducing the search space of candidate itemsets that are considered infrequent, which speeds up the generation of association rules.
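The two stages described in the abstract can be illustrated with a small Python sketch. This is not the thesis algorithm (which is lattice-based and written in C); it is a brute-force illustration under stated assumptions: the function names (`closed_itemsets`, `top_k_closed`, `rules_from_closed`) and the toy transactions are hypothetical, ties at the k-th support value are all kept, and rules are formed only between pairs of closed itemsets related by the subset relation, with confidence = support(Y) / support(X).

```python
from itertools import combinations


def closed_itemsets(transactions):
    """Return {closed itemset: support} by brute force.

    The closure of an itemset is the intersection of all transactions
    that contain it; an itemset is closed when it equals its closure.
    Enumerates every item combination, so only suitable for tiny inputs.
    """
    items = sorted(set().union(*transactions))
    closed = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            covering = [t for t in transactions if candidate <= t]
            if not covering:
                continue
            closure = frozenset.intersection(*covering)
            closed[closure] = len(covering)
    return closed


def top_k_closed(transactions, k):
    """Keep the closed itemsets whose support reaches the k-th highest
    support value (assumption: ties at the cut-off are all kept)."""
    closed = closed_itemsets(transactions)
    if len(closed) <= k:
        return closed
    threshold = sorted(closed.values(), reverse=True)[k - 1]
    return {s: sup for s, sup in closed.items() if sup >= threshold}


def rules_from_closed(closed, min_conf):
    """Generate rules X -> (Y - X) for closed itemsets X that are proper
    subsets of closed itemsets Y, keeping those with confidence
    support(Y) / support(X) >= min_conf."""
    rules = []
    for x, sup_x in closed.items():
        for y, sup_y in closed.items():
            if x < y:  # proper-subset test, the relation a lattice stores
                conf = sup_y / sup_x
                if conf >= min_conf:
                    rules.append((x, y - x, conf))
    return rules


if __name__ == "__main__":
    # Hypothetical toy database of five transactions over items A, B, C.
    tx = [frozenset("AB"), frozenset("ABC"), frozenset("AC"),
          frozenset("AB"), frozenset("B")]
    top = top_k_closed(tx, k=3)
    for antecedent, consequent, conf in rules_from_closed(top, min_conf=0.7):
        print(sorted(antecedent), "->", sorted(consequent), round(conf, 2))
```

In this sketch the user supplies only k and a minimum confidence, mirroring the interface described in the abstract. The nested loop in `rules_from_closed` tests every pair of itemsets for the subset relation; a lattice that stores those subset and superset links explicitly would let the rule-generation step follow them directly instead.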

Item Type: Thesis (Masters)
Additional Information: RTIf 005.1 Hap p-1 2008
Uncontrolled Keywords: top-k frequent closed itemset, lattice data structure, association rules, data mining
Subjects: Q Science > QA Mathematics > QA76.758 Software engineering
Divisions: Faculty of Information Technology > Informatics Engineering > 55101-(S2) Master Thesis
Depositing User: EKO BUDI RAHARJO
Date Deposited: 01 Jul 2024 02:21
Last Modified: 01 Jul 2024 03:12
URI: http://repository.its.ac.id/id/eprint/108093
