HARYONO, SUKO (2017) BAYESIAN NETWORK UNTUK KLASIFIKASI RUMAH TANGGA MISKIN DI ACEH. Masters thesis, Institut Teknologi Sepuluh Nopember Surabaya.
Text: 1315201714-Master_ thesis.pdf - Published Version (2MB). Restricted to Repository staff only.
Abstract
Salah satu aspek penting dalam mendukung upaya pemerintah dalam kebijakan pengentasan kemiskinan adalah penyediaan data yang akurat terkait rumah tangga miskin. Penelitian ini bertujuan melakukan klasifikasi rumah tangga miskin berdasarkan karakteristiknya yang multidimensi, seperti sosial, ekonomi, demografi, dan lain-lain. Rumah tangga terkategori miskin jika memiliki pengeluaran di bawah garis kemiskinan.
Model regresi logistik dan Bayesian network digunakan dalam pemodelan klasifikasi rumah tangga miskin di Aceh. Regresi logistik (RL) sudah banyak digunakan dalam pemodelan klasifikasi termasuk klasifikasi rumah tangga miskin. Sayangnya, metode RL sering mengalami under fitting dalam menangani masalah data yang imbalance. Padahal dalam kasus kemiskinan, kondisi imbalance akan sering terjadi. Hal ini karena jumlah rumah tangga miskin jauh lebih sedikit dibandingkan dengan rumah tangga tidak miskin.
Naïve Bayes (NB) merupakan salah satu model Bayesian network yang sudah banyak diaplikasikan dalam pemodelan klasifikasi. Dalam berbagai penelitian NB menunjukkan ketepatan klasifikasi yang lebih baik daripada RL. Model NB dibangun berdasarkan asumsi seluruh variabel atribut saling independen given variabel kelas. Dalam bidang-bidang sosial masalah dependensi antar variabel atribut sangat sering terjadi apalagi jika melibatkan variabel yang banyak. Untuk mengatasinya, maka digunakan dua model Bayesian network lainnya, yaitu Tree Augmented Naïve Bayes (TAN) dan Hierarchical Naïve Bayes (HNB). Dua metode ini dapat mengatasi permasalahan adanya saling dependensi (hubungan) antar atribut dengan baik. Berdasarkan hal tersebut di atas, penelitian ini melakukan komparasi tingkat akurasi dari RL, NB, dan TAN, serta HNB dalam pemodelan klasifikasi rumah tangga miskin di Aceh. Hasil penelitian menunjukkan performa model Bayesian network lebih baik jika dibandingkan dengan regresi logistik. Hal ini dilihat dari segi sensitivity, G-measure dan F-measure. Regresi logistik lebih cenderung mengklasifikasikan suatu rumah tangga sebagai rumah tangga tidak miskin.
=========================================================================
An important part of supporting the government's poverty alleviation policy is the provision of accurate data on poor households. This study aims to classify poor households based on their multidimensional characteristics, such as social, economic, and demographic characteristics, among others. A household is categorized as poor if its expenditure is below the poverty line. Therefore, households are categorized using the poverty line approach and clustering with Ward’s method.
Logistic regression and Bayesian networks are applied to the classification of poor households in Aceh. Logistic regression (RL) has been widely used for classification, including the classification of poor households. Unfortunately, this method often suffers from underfitting when the data are imbalanced. In the case of poverty, imbalance occurs often, because the number of poor households is much smaller than the number of non-poor households.
Naïve Bayes (NB) is a Bayesian network model that has been widely applied in classification. In various studies, NB has shown better classification accuracy than RL. The NB model is built on the assumption that all attribute variables are mutually independent given the class variable. In the social domain, dependencies between attribute variables are very common, especially when many variables are involved. Such conditions may reduce the performance of the model. To overcome this problem, two other Bayesian network models, Tree Augmented Naïve Bayes (TAN) and Hierarchical Naïve Bayes (HNB), are applied. These two methods handle mutual dependencies between the attributes well. Based on the above, this study compares the accuracy of RL, NB, TAN, and HNB in modeling the classification of poor households in Aceh. The results show that the Bayesian network models perform better than logistic regression in classifying households in terms of sensitivity, G-measure, and F-measure. Logistic regression tends to classify a household as non-poor.
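The metrics named above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `imbalance_metrics`, the confusion-matrix counts, and the exact metric definitions are assumptions made for illustration. In particular, the abstract does not define G-measure; here it is assumed to be the geometric mean of precision and sensitivity, whereas some authors instead use the geometric mean of sensitivity and specificity.

```python
from math import sqrt


def imbalance_metrics(tp, fn, fp, tn):
    """Metrics for the positive ("poor") class from confusion-matrix counts.

    tp: poor households classified as poor
    fn: poor households classified as non-poor
    fp: non-poor households classified as poor
    tn: non-poor households classified as non-poor
    """
    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is zero.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # share of poor households detected
    precision = ratio(tp, tp + fp)    # share of "poor" predictions that are right
    accuracy = ratio(tp + tn, tp + fn + fp + tn)
    # F-measure: harmonic mean of precision and sensitivity.
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    # G-measure (assumed definition): geometric mean of precision and sensitivity.
    g_measure = sqrt(precision * sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f_measure": f_measure,
        "g_measure": g_measure,
    }


# Hypothetical imbalanced sample: 100 poor and 900 non-poor households.
# A classifier that labels every household as non-poor:
baseline = imbalance_metrics(tp=0, fn=100, fp=0, tn=900)
# A classifier that detects 60 of the 100 poor households:
detector = imbalance_metrics(tp=60, fn=40, fp=90, tn=810)

print(baseline)
print(detector)
```

With these made-up counts, the classifier that labels every household as non-poor has the higher accuracy but a sensitivity of zero, which is why accuracy alone can be misleading when poor households are a small minority.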
Item Type: | Thesis (Masters) |
---|---|
Uncontrolled Keywords: | Bayesian network, klasifikasi, regresi logistik, rumah tangga miskin, Bayesian networks, classification, logistic regression, poor household |
Subjects: | H Social Sciences > HA Statistics; Q Science > QA Mathematics > QA278.2 Regression Analysis. Logistic regression |
Divisions: | Faculty of Mathematics and Science > Statistics > 49101-(S2) Master Thesis |
Depositing User: | - SUKO HARYONO |
Date Deposited: | 24 Jan 2017 09:04 |
Last Modified: | 24 Jan 2017 09:04 |
URI: | http://repository.its.ac.id/id/eprint/3047 |