Klasifikasi Multi-Label Pada Artikel Jurnal ScienceDirect Dengan Metode K-Nearest Neighbor (KNN) Dan Support Vector Machine (SVM)

Rabbanie, Rifqi (2020) Klasifikasi Multi-Label Pada Artikel Jurnal ScienceDirect Dengan Metode K-Nearest Neighbor (KNN) Dan Support Vector Machine (SVM). Other thesis, Institut Teknologi Sepuluh Nopember.

Abstract

ScienceDirect adalah platform online yang menyediakan akses terbitan yang terbanyak di dunia. Jurnal yang diterbitkan oleh ScienceDirect dikelompokkan menjadi empat bidang, yaitu Physical Sciences and Engineering, Life Sciences, Health Sciences, dan Social Sciences and Humanities. Meskipun begitu, ScienceDirect masih belum mengelompokkan artikel jurnal yang tersimpan di platform miliknya. Selain itu, sangat memungkinkan artikel jurnal yang disimpan termasuk ke dalam dua bidang atau lebih yang berbeda, sehingga perlu untuk diklasifikasikan secara multi-label. Data yang digunakan dalam penelitian ini berupa abstrak dari artikel jurnal dengan kata kunci “Data Mining” dan setiap jurnal diklasifikasikan berdasarkan abstrak tersebut. Sebelum data diklasifikasikan, terlebih dulu melalui tahapan pre-processing pada data teks abstrak tersebut, seperti proses case folding, delete punctuation, remove number, tokenization, remove stopwords, dan lemmatization. Setelah itu data tersebut diklasifikasikan secara multi-label dengan pendekatan problem transformation Label Powerset, yaitu dengan mentransformasi data kategori setiap artikel jurnal yang semula multi-label menjadi multi-class yang selanjutnya diklasifikasikan dengan KNN dan SVM. Kinerja klasifikasi KNN dan SVM diukur dengan nilai hamming loss dan didapatkan kesimpulan bahwa berdasarkan nilai hamming loss, SVM memberikan hasil ketepatan klasifikasi yang lebih baik jika dibandingkan dengan KNN.

======================================================================================================================

ScienceDirect is an online platform that provides access to the largest number of publications in the world. Journals published by ScienceDirect are grouped into four fields: Physical Sciences and Engineering, Life Sciences, Health Sciences, and Social Sciences and Humanities. However, ScienceDirect does not yet group the individual journal articles stored on its platform. Moreover, a stored journal article may well belong to two or more different fields, so the articles need to be classified in a multi-label manner. The data used in this study are abstracts of journal articles retrieved with the keyword "Data Mining", and each journal article is classified based on its abstract. Before classification, the abstract text goes through pre-processing steps: case folding, punctuation removal, number removal, tokenization, stopword removal, and lemmatization. The data are then classified in a multi-label manner using the Label Powerset problem transformation approach, which transforms the originally multi-label category data of each journal article into multi-class data that are then classified with KNN and SVM. The classification performance of KNN and SVM is measured by Hamming loss, and based on that value it is concluded that SVM gives better classification accuracy than KNN.
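
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis code. The stopword list, the function names, and the toy label matrix are assumptions made for illustration. Lemmatization and the KNN/SVM classifiers themselves are left out, since both would normally come from an external library. The sketch only shows the simpler pre-processing steps, how Label Powerset turns each distinct combination of the four fields into a single class, and how Hamming loss is computed once predictions are decoded back into label vectors.

```python
import re
import string

# Tiny illustrative stopword list (an assumption; the thesis does not list its stopwords).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "is", "to", "for", "on"}


def preprocess(abstract):
    """Apply case folding, punctuation/number removal, tokenization, stopword removal."""
    text = abstract.lower()                                            # case folding
    text = text.translate(str.maketrans("", "", string.punctuation))  # punctuation removal
    text = re.sub(r"\d+", "", text)                                    # number removal
    tokens = text.split()                                              # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]                   # stopword removal


def label_powerset_fit(label_rows):
    """Assign one class id to each distinct label combination, in order of appearance."""
    classes = {}
    for row in label_rows:
        classes.setdefault(tuple(row), len(classes))
    return classes


def label_powerset_transform(label_rows, classes):
    """Multi-label rows -> single multi-class targets."""
    return [classes[tuple(row)] for row in label_rows]


def label_powerset_inverse(class_ids, classes):
    """Multi-class predictions -> multi-label rows."""
    inverse = {cid: list(combo) for combo, cid in classes.items()}
    return [inverse[cid] for cid in class_ids]


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that were predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr))
    return wrong / total


# Columns: Physical Sciences and Engineering, Life Sciences,
# Health Sciences, Social Sciences and Humanities (toy data).
y_true = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0]]

classes = label_powerset_fit(y_true)
y_multiclass = label_powerset_transform(y_true, classes)

# Hypothetical output of a multi-class classifier (KNN or SVM in the thesis).
predicted_ids = [0, 0, 2, 1]
y_pred = label_powerset_inverse(predicted_ids, classes)

tokens = preprocess("Data Mining of 3 Journals, in 2020!")
loss = hamming_loss(y_true, y_pred)
```

In this toy run the second article is predicted as belonging to one field instead of two, so one of the sixteen label slots is wrong. A lower Hamming loss means fewer wrong slots, which is the sense in which the abstract ranks SVM above KNN.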

Item Type: Thesis (Other)
Additional Information: RSSt 519.53 Rab k-1 • Rabbanie, Rifqi
Uncontrolled Keywords: Grid Search, Hamming Loss, Label Powerset, Multi-Label, Support Vector Machine
Subjects: H Social Sciences > HD Industries. Land use. Labor > HD108 Classification (Theory. Method. Relation to other subjects )
Divisions: Faculty of Mathematics and Science > Statistics > 49201-(S1) Undergraduate Thesis
Depositing User: Rifqi Rabbanie
Date Deposited: 26 Aug 2020 03:10
Last Modified: 21 Dec 2023 08:14
URI: http://repository.its.ac.id/id/eprint/81154
