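Mustofa, Reza (2019) Topic Discovery pada Jurnal-jurnal Penelitian di IEEE Explore Menggunakan Association Rule Mining dengan Pendekatan Closed Frequent Itemset [Topic Discovery in Research Journals on IEEE Xplore Using Association Rule Mining with a Closed Frequent Itemset Approach]. Other thesis, Institut Teknologi Sepuluh Nopember.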
Abstract
Discovering topics in a collection of documents, such as scientific publications, has many benefits. As the number of text documents produced on the web and in digital archives grows, topic discovery has become an important tool for browsing, summarizing, and grouping documents. One application of Association Rule Mining is to find topics in a document collection by searching for patterns that occur frequently across the documents. The data were taken from IEEE Xplore and consist of abstracts of papers from the International Conference on Data Mining (ICDM) and the International Conference on Data Engineering (ICDE) from 2009 to 2018. Each abstract is represented as a transaction, and the keywords it contains are represented as items. The most frequently occurring combinations of keywords, called frequent itemsets, are used as topic candidates. The Apriori and ECLAT algorithms can both be used to generate frequent itemsets; ECLAT generates them in less execution time than Apriori. Restricting the result to closed frequent itemsets also reduces the number of frequent itemsets produced, so that the resulting topics are unique.
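The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the keyword sets, the minimum support of 2, and the function names `eclat` and `closed_only` are all assumptions made for illustration. Each abstract is a transaction (a set of keywords), frequent itemsets are found ECLAT-style by intersecting transaction-id sets, and only closed itemsets (those with no proper superset of equal support) are kept as topic candidates.

```python
def eclat(transactions, min_support):
    """Return {frozenset(items): support count} for every frequent itemset."""
    # Vertical layout: map each item to the ids of the transactions containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs already frequent together with prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            # Grow the itemset by intersecting tidsets with the later items.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    initial = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(set(), initial)
    return frequent


def closed_only(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    closed = {}
    for itemset, support in frequent.items():
        absorbed = any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
        if not absorbed:
            closed[itemset] = support
    return closed


# Hypothetical toy data: each set stands for the keywords of one abstract.
abstracts = [
    {"clustering", "graph", "streaming"},
    {"clustering", "graph"},
    {"clustering", "graph", "privacy"},
    {"privacy", "streaming"},
]

frequent = eclat(abstracts, min_support=2)
closed = closed_only(frequent)
for itemset, support in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

On this toy input the five frequent itemsets collapse to three closed ones, because `{"clustering"}` and `{"graph"}` always occur together and are absorbed by `{"clustering", "graph"}`. This is the kind of redundancy reduction the abstract attributes to closed frequent itemsets.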
Item Type: | Thesis (Other) |
---|---|
Additional Information: | RSSt 006.32 Mus t-1 2019 |
Uncontrolled Keywords: | Apriori Algorithm, Association Rule, Closed Frequent Itemset, Eclat Algorithm, Network Analysis, Text Mining |
Subjects: | Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science) |
Divisions: | Faculty of Science and Data Analytics (SCIENTICS) > Statistics > 49201-(S1) Undergraduate Thesis |
Depositing User: | Reza Musofa |
Date Deposited: | 03 Dec 2024 03:12 |
Last Modified: | 03 Dec 2024 03:12 |
URI: | http://repository.its.ac.id/id/eprint/64570 |