Topic Modeling Pada Ulasan Hotel Menggunakan Latent Dirichlet Allocation (LDA) Dan Probabilistic Latent Semantic Analysis (PLSA)

Mustahidah, Mita (2021) Topic Modeling Pada Ulasan Hotel Menggunakan Latent Dirichlet Allocation (LDA) Dan Probabilistic Latent Semantic Analysis (PLSA). Undergraduate thesis, Institut Teknologi Sepuluh Nopember.

Abstract

The Covid-19 pandemic has pushed four-star hotels in the city of Bandung, namely Gino Feruci Braga, Golden Flower, and Belviu Hotel, to keep improving the quality of their services. To identify improvements that match what guests want, hotels need to understand guest needs through the reviews guests leave. These reviews are free text and very numerous, which makes them a challenge for each hotel to process. This study aims to predict the positive and negative sentiment of hotel guests from their reviews using the Naïve Bayes Classifier (NBC), and to extract the topics discussed within each sentiment class for each hotel using Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (PLSA). The data are hotel guest reviews from the Tripadvisor website. The research stages are data collection, text preprocessing, word weighting with Term Frequency Inverse Document Frequency (TF-IDF), sentiment analysis with NBC combined with the SMOTE technique, measurement of classification accuracy, validation of the number of topics using topic coherence, topic modeling with LDA and PLSA, and evaluation of the topic models using the Calinski Harabasz index. The study identifies the topics in the positive and negative reviews of each hotel using both LDA and PLSA. Based on the Calinski Harabasz values, LDA performs better than PLSA.
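
The abstract does not say how the Calinski Harabasz index is computed for the topic models. The following is a minimal sketch, assuming each review is assigned to its highest-weight topic and the index is then computed on the document-topic weights. The function name `calinski_harabasz` and the toy `doc_topic` matrix are illustrative and do not come from the thesis.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion (higher is better)."""
    n = len(points)
    dim = len(points[0])

    # Group the points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    def mean(rows):
        return [sum(row[j] for row in rows) / len(rows) for j in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # size-weighted squared distance of centroids to the overall mean
    within = 0.0   # squared distance of points to their own centroid
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)

    if within == 0:
        return float("inf")  # every cluster is a single repeated point
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical document-topic weights for four reviews and two topics.
doc_topic = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.2, 0.8],
    [0.1, 0.9],
]

# Assumption: each review belongs to its dominant (highest-weight) topic.
labels = [max(range(len(row)), key=row.__getitem__) for row in doc_topic]
score = calinski_harabasz(doc_topic, labels)
print(labels, score)
```

Under this assumption, the model whose topics separate the reviews more cleanly gets the larger score, which is the sense in which the abstract ranks LDA above PLSA.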

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Calinski Harabasz, LDA, NBC, PLSA, Praproses Teks, SMOTE, TF-IDF, Topic Coherence, Text Preprocessing
Subjects: Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines.
Divisions: Faculty of Science and Data Analytics (SCIENTICS) > Statistics > 49201-(S1) Undergraduate Thesis
Depositing User: Mita Mustahidah
Date Deposited: 07 Sep 2021 04:19
Last Modified: 28 Jun 2024 00:23
URI: http://repository.its.ac.id/id/eprint/91644
