Analysis of House Characteristic Factors and House Price Prediction Modeling Using Machine Learning Methods

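Nathalia, Minoque Kusuma Salma Rasyid Jr. (2026) Analisis Faktor Karakteristik Rumah Dan Pemodelan Prediksi Harga Rumah Menggunakan Metode Machine Learning [Analysis of House Characteristic Factors and House Price Prediction Modeling Using Machine Learning Methods]. Masters thesis, Institut Teknologi Sepuluh Nopember.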

6032241127-Master_Thesis.pdf - Accepted Version
Restricted to Repository staff only

Abstract

House price prediction supports decision-making in the real-estate industry, particularly for agents who must set selling prices that are competitive yet realistic, so that neither sellers nor buyers are exposed to unnecessary financial risk. Although machine learning has been widely applied in property valuation research, most existing studies rely on public datasets that do not always reflect actual market conditions. This study develops a systematic machine learning pipeline for predicting residential property prices using a primary dataset obtained from a real-estate agency in Surabaya, collected between 2022 and 2025. Preprocessing included feature engineering, missing-value handling, standardization of inconsistent entries, outlier removal, temporal feature extraction, and categorical encoding. Ten model configurations were evaluated by combining Random Forest and XGBoost under three feature scenarios: all features, mutual-information-based selection, and forward stepwise selection. Hyperparameters were optimized using Optuna, and model performance was assessed using RMSE, MAE, MAPE, and R² through 5-fold cross-validation. The best-performing model was a tuned Random Forest trained on 24 stepwise-selected features, achieving an RMSE of approximately IDR 267.9 million, an MAE of approximately IDR 205.5 million, a MAPE of 23.7%, and an R² of 0.72. Feature selection did not improve the performance of XGBoost, but for Random Forest it produced a more compact and efficient model without reducing predictive accuracy.
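
As a rough illustration of the four evaluation metrics named in the abstract, the following Python sketch computes RMSE, MAE, MAPE, and R² on a handful of made-up prices and shows one simple way to form five cross-validation folds. It is not the thesis's code: the function names, the sample values, and the unshuffled contiguous fold split are all assumptions chosen for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the prices."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the prices."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


# Hypothetical prices in millions of IDR (not taken from the thesis dataset).
y_true = [500.0, 1000.0, 2000.0, 800.0]
y_pred = [550.0, 900.0, 2100.0, 800.0]

print("RMSE:", rmse(y_true, y_pred))
print("MAE: ", mae(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
print("R2:  ", r2(y_true, y_pred))

# Each of the 5 folds serves once as the held-out test set.
folds = list(kfold_indices(10, k=5))
```

In a cross-validated evaluation, each metric would be computed on the held-out fold and then averaged over the five folds. RMSE penalizes large errors more heavily than MAE, which is consistent with the reported RMSE being larger than the reported MAE.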

Item Type: Thesis (Masters)
Uncontrolled Keywords: house price prediction, machine learning, feature engineering, hyperparameter tuning
Subjects: Q Science > QA Mathematics > QA278.2 Regression Analysis. Logistic regression
T Technology > T Technology (General) > T57.8 Nonlinear programming. Support vector machine. Wavelets. Hidden Markov models.
Divisions: Interdisciplinary School of Management and Technology (SIMT) > 61101-Master of Technology Management (MMT)
Depositing User: Nathalia Minoque Kusuma Salma Rasyid Jr.
Date Deposited: 28 Jan 2026 03:47
Last Modified: 28 Jan 2026 03:47
URI: http://repository.its.ac.id/id/eprint/130726
