Text Clustering for Determining Topics of Online News about the City of Surabaya Using K-Means and Self-Organizing Maps

Leviany, Fonda (2019) Text Clustering untuk Penentuan Topik Berita Online Mengenai Kota Surabaya dengan Metode K-Means dan Self-Organizing Maps. Other thesis, Institut Teknologi Sepuluh Nopember.

Abstract

News informs the public about events as they happen. One online news site covering the city of Surabaya and its surroundings is SURYA.co.id, located at http://surabaya.tribunnews.com/. The site is the digital edition of the Surya daily newspaper, which in February 2019 was named Best Regional Newspaper for Java by IPMA. News published on the site is expected to be well categorized so that residents of Surabaya can find the information they are looking for more quickly. However, a categorization feature for news about Surabaya is not yet available. This research is expected to benefit the public, the government, and the management of Tribunnews Surabaya. The corpus of news published during 2018 goes through text pre-processing, tokenizing, feature selection, and clustering with K-Means and Self-Organizing Maps. Tokenizing uses unigram and bigram approaches. Evaluation by average silhouette width shows that K-Means gives better clustering results than Self-Organizing Maps, with an optimum of 10 clusters. The news topics frequently covered during 2018 were sexual harassment, trains, Universitas Airlangga, regency/city police, drugs, RSUD dr. Soetomo, the regional police, Tanjung Perak Harbour, and education, as well as entertainment, crime, important events, and others.
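The abstract names two steps that are easy to illustrate in isolation: unigram/bigram tokenizing and evaluation by average silhouette width. The following is a minimal sketch in standard-library Python, not the thesis's own implementation. The function names `ngrams` and `average_silhouette_width`, the whitespace tokenizer, the Euclidean distance, and the toy points are all assumptions made for illustration; the thesis's actual pre-processing, feature weighting, and tooling are not stated in this record.

```python
from math import dist


def ngrams(text, n):
    """Return word n-grams of a lowercased, whitespace-split text (hypothetical helper)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def average_silhouette_width(points, labels):
    """Mean silhouette value over all points, assuming Euclidean distance.

    Points in singleton clusters contribute 0, the usual convention.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # s(i) = 0 for a singleton cluster
        # a(i): mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:  # identical points everywhere would give 0/0; treat as 0
            total += (b - a) / denom
    return total / len(points)


# Unigrams and bigrams of one phrase taken from the topic list.
phrase = "Pelabuhan Tanjung Perak"
tokens = ngrams(phrase, 1) + ngrams(phrase, 2)
print(tokens)

# Toy 2-D points (made up): two tight, well-separated groups.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(average_silhouette_width(toy_points, [0, 0, 1, 1]))
```

A width close to 1 indicates compact, well-separated clusters, and values near or below 0 indicate overlapping or misassigned points. Comparing this score across candidate cluster counts and across methods is one common way to arrive at a choice such as the 10 clusters reported here.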

Item Type: Thesis (Other)
Additional Information: RSSt 519.53 Lev t-1 2019
Uncontrolled Keywords: Online News, K-Means, N-Gram, Self-Organizing Maps, Text Clustering
Subjects: Q Science > QA Mathematics > QA278.55 Cluster analysis
Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science)
Q Science > QA Mathematics > QA9.58 Algorithms
Divisions: Faculty of Mathematics, Computation, and Data Science > Statistics > 49201-(S1) Undergraduate Thesis
Depositing User: Fonda Leviany
Date Deposited: 29 May 2023 01:35
Last Modified: 29 May 2023 01:35
URI: http://repository.its.ac.id/id/eprint/63904
