A Performance Evaluation of Classifiers Employ Language Dependent Tools for Indonesian Text

Sumpeno, Surya and Arifin, Agus Zainal and Hariadi, Mochamad and Purnomo, Mauridhi Hery (2010) A Performance Evaluation of Classifiers Employ Language Dependent Tools for Indonesian Text. In: Scientific Article of 11Th Seminar On Intelligent Technology and ITS Application (SITIA 2010).




This paper evaluates the performance of Maximum Entropy (MaxEnt), Support Vector Machine (SVM), and Naïve Bayes (NB) techniques for Indonesian text classification. The performance of the MaxEnt and SVM techniques is compared against the baseline NB technique. We also investigate the effect that language-dependent tools, such as Indonesian stemming and stop-word removal, have on the classification performance of these techniques. To date, there has been no experimental report on the effect of an Indonesian stemmer on text classification accuracy. From our experiments, we conclude that maximum entropy generally performs better than the other classifiers. Language-dependent tools such as stemming and stop-word removal have only a small effect on the accuracy of text classification. However, the stemmed approach achieved the highest average accuracy and, because it reduces the dimension of the feature vectors used in classification, it is a viable step in the pre-processing stage.
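As a rough illustration of the dimension reduction mentioned in the abstract, the following minimal sketch counts distinct features (vocabulary size) before and after stop-word removal and stemming. It is not the paper's pipeline: the stop-word list, the affix rules in `toy_stem`, and the sample sentence are all hypothetical and far simpler than a real Indonesian stemmer.

```python
# Toy illustration only. STOP_WORDS, PREFIXES, SUFFIXES and the sample
# sentence are assumptions made for this sketch, not taken from the paper.
STOP_WORDS = {"yang", "dan", "di", "itu"}
PREFIXES = ("me", "di")
SUFFIXES = ("kan", "an")
MIN_ROOT = 4  # never strip an affix if fewer than 4 letters would remain


def toy_stem(word):
    """Strip at most one prefix and one suffix from a lowercase word."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_ROOT:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_ROOT:
            word = word[:-len(suffix)]
            break
    return word


def vocabulary(tokens, remove_stop_words=False, stem=False):
    """Return the set of distinct features after optional pre-processing."""
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    return set(tokens)


tokens = "dia memakan makanan yang dimakan dan makan di rumah itu".split()

raw = vocabulary(tokens)
no_stop = vocabulary(tokens, remove_stop_words=True)
stemmed = vocabulary(tokens, remove_stop_words=True, stem=True)

print(len(raw), len(no_stop), len(stemmed))  # prints: 10 6 3
```

In this toy case the inflected forms of "makan" collapse into one feature, so the feature vector shrinks even though the information available to a classifier changes little, which is consistent with the small accuracy effect the abstract reports.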

Item Type: Conference or Workshop Item (Paper)
Additional Information: Collection ID : 51105120000032 Call Number : 005.13 Sum p
Uncontrolled Keywords: maximum entropy, support vector machine, naïve bayes, indonesian text classification, language dependent tool, stopwords removal, stemming
Subjects: Q Science > Q Science (General) > Q370 Entropy (Information theory)
Divisions: Faculty of Industrial Technology > Electrical Engineering
Depositing User: - Davi Wah
Date Deposited: 08 Oct 2019 03:06
Last Modified: 08 Oct 2019 03:06
URI: https://repository.its.ac.id/id/eprint/71038
