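Efendi, Muchni Illahi (2026) Pemilihan Bandwidth Optimal Pada Regresi Nonparametrik Kernel Menggunakan Metode Generalized Cross-Validation (GCV) dan Unbiased Risk (UBR) [Optimal Bandwidth Selection in Kernel Nonparametric Regression Using the Generalized Cross-Validation (GCV) and Unbiased Risk (UBR) Methods]. Masters thesis, Institut Teknologi Sepuluh Nopember.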
Text: 6003241050-Master_Thesis.pdf - Accepted Version (4MB). Restricted to repository staff only.
Abstract
Nonparametric regression is a flexible statistical approach because it does not require assuming a specific functional form for the relationship between the response and the predictors. Kernel regression is a nonparametric estimation method that is well suited to data with unknown or irregular patterns and is relatively simple to compute. A kernel estimate depends on two choices: the bandwidth and the kernel function. Of the two, the bandwidth matters most, because it strongly influences the shape of the fitted regression curve. Several methods have been developed for choosing the optimal bandwidth, including Cross-Validation (CV), Generalized Cross-Validation (GCV), and Unbiased Risk (UBR). This study aims to examine the GCV and UBR methods for selecting the optimal bandwidth in kernel nonparametric regression models, to compare their bandwidth-selection performance on simulated data, and to compare both methods in an application to economic growth rate data for regencies and cities on Java Island in 2024. The kernel estimator used is the Nadaraya–Watson estimator. The simulation study generates data from an exponential function with normally distributed errors under various combinations of sample size and error variance. In the simulation, GCV selects the optimal bandwidth more accurately and more consistently than UBR for every combination of sample size and error variance. The application to real data gives the same result: the out-of-sample MSE is 0.08213 for GCV, which is smaller than the 0.08528 obtained for UBR. GCV is therefore the better of the two methods for selecting the optimal bandwidth in multivariable kernel nonparametric regression, with optimal bandwidths for the three predictors of h_1 = 0.97143, h_2 = 1.08571, and h_3 = 0.71429.
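
The bandwidth-selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: it assumes a single predictor, a Gaussian kernel, a simple grid search, and the standard textbook forms GCV(h) = (RSS/n) / (1 - tr(A)/n)^2 and UBR(h) = RSS/n + 2σ² tr(A)/n - σ², where A is the Nadaraya–Watson smoother matrix and σ² is treated as known. The function names, grid, sample size, and noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def nw_smoother_matrix(x, h):
    """Nadaraya-Watson smoother matrix A(h), Gaussian kernel (assumed), so y_hat = A @ y."""
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)
    return w / w.sum(axis=1, keepdims=True)  # each row sums to 1

def gcv(x, y, h):
    """GCV score: mean squared residual divided by (1 - tr(A)/n)^2."""
    n = len(y)
    a = nw_smoother_matrix(x, h)
    resid = y - a @ y
    return (resid @ resid / n) / (1.0 - np.trace(a) / n) ** 2

def ubr(x, y, h, sigma2):
    """UBR score; requires the error variance sigma2 (assumed known here)."""
    n = len(y)
    a = nw_smoother_matrix(x, h)
    resid = y - a @ y
    return resid @ resid / n + 2.0 * sigma2 * np.trace(a) / n - sigma2

def select_bandwidth(criterion, grid):
    """Return the grid value that minimises the given criterion."""
    scores = np.array([criterion(h) for h in grid])
    return grid[int(np.argmin(scores))]

# Simulated data: exponential mean function with Normal errors (illustrative settings).
rng = np.random.default_rng(0)
n, sigma = 100, 0.5
x = np.sort(rng.uniform(0.0, 2.0, n))
y = np.exp(x) + rng.normal(0.0, sigma, n)

grid = np.linspace(0.05, 1.5, 30)
h_gcv = select_bandwidth(lambda h: gcv(x, y, h), grid)
h_ubr = select_bandwidth(lambda h: ubr(x, y, h, sigma**2), grid)
print(f"GCV bandwidth: {h_gcv:.3f}, UBR bandwidth: {h_ubr:.3f}")
```

With several predictors, as in the thesis's application, the same idea would presumably use a product kernel with one bandwidth per predictor and a search over a multi-dimensional grid; that extension is not shown here.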
| Item Type: | Thesis (Masters) |
|---|---|
| Uncontrolled Keywords: | Generalized Cross-Validation, Kernel, Economic Growth Rate, Nonparametric Regression, Unbiased Risk |
| Subjects: | Q Science > QA Mathematics > QA278.2 Regression Analysis. Logistic regression; Q Science > QA Mathematics > QA353.K47 Kernel functions (analysis) |
| Divisions: | Faculty of Mathematics and Science > Statistics > 49101-(S2) Master Thesis |
| Depositing User: | Muchni Illahi Efendi |
| Date Deposited: | 27 Jan 2026 06:51 |
| Last Modified: | 27 Jan 2026 06:51 |
| URI: | http://repository.its.ac.id/id/eprint/130512 |
