Development of Preprocessing for Chronic Kidney Disease Classification Using K-Nearest Neighbors Imputer and Chi-Square

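Mardianto, Ricky (2025) Pengembangan Pra-Proses Klasifikasi Chronic Kidney Disease Menggunakan K-Nearest Neighbors Imputer dan Chi-Square [Development of Preprocessing for Chronic Kidney Disease Classification Using K-Nearest Neighbors Imputer and Chi-Square]. Masters thesis, Institut Teknologi Sepuluh Nopember.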

6025231043-Master_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 April 2027.

Abstract

Chronic Kidney Disease (CKD) is a condition in which the function and/or structure of the kidneys is so severely damaged that they can no longer filter blood as they should. The disease progresses slowly and is difficult to reverse. In its early stages CKD often shows no clear symptoms, so patients are frequently unaware of it. Among its main risks are complications and death. Machine learning is increasingly popular for disease detection, including CKD, and machine learning algorithms help identify and predict early-stage CKD. Early detection allows appropriate medical action and treatment to be given, preventing the risk of other diseases. Recent research shows that CKD detection is hampered because the data are often invalid and contain many missing values. Optimal handling of missing values and the use of feature selection are therefore expected to improve prediction quality in the early detection of CKD. This study handles missing values with the K-Nearest Neighbors (KNN) imputer and selects features with the Chi-square test, using a Chronic Kidney Disease dataset of 400 samples and 25 features taken from Kaggle.com. The machine learning methods used are Extra Tree Classifier, Random Forest, and XGBoost; the deep learning methods used are TabNet and TabTransformers. In the experiments, the Extra Tree Classifier achieved an accuracy of 99.25%, better than the other algorithms. Handling missing values with the KNN imputer and selecting features with the Chi-square test is thus a sound approach for the early detection of CKD.
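
The preprocessing chain described in the abstract (KNN imputation, Chi-square feature selection, then a tree-ensemble classifier) could be sketched with scikit-learn roughly as follows. This is an illustrative sketch only, not the thesis's actual code: the synthetic data, the helper names make_synthetic_data and build_pipeline, the min-max scaling step, and the values n_neighbors=5 and k_features=10 are all assumptions made for the example. The scaling step is included because scikit-learn's chi2 scorer requires non-negative inputs; the real dataset also contains categorical attributes that would need encoding first.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.impute import KNNImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


def make_synthetic_data(n_samples=400, n_features=25, missing_rate=0.1, seed=0):
    """Hypothetical stand-in for the CKD data: numeric features with gaps.

    The shape only loosely mirrors the 400 x 25 figure quoted in the
    abstract; the values themselves are random, not medical data.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    X = rng.normal(size=(n_samples, n_features))
    # Assumption: only the first five features carry class signal.
    X[:, :5] += 2.0 * y[:, None]
    # Knock out roughly `missing_rate` of the entries at random.
    X[rng.random(X.shape) < missing_rate] = np.nan
    return X, y


def build_pipeline(n_neighbors=5, k_features=10, seed=0):
    """Scale -> KNN-impute -> Chi-square selection -> Extra Trees."""
    return Pipeline([
        # MinMaxScaler ignores NaN when fitting and keeps the training
        # data in [0, 1], which chi2 needs (non-negative inputs).
        ("scale", MinMaxScaler()),
        # Fill each missing entry with the mean of its nearest neighbours.
        ("impute", KNNImputer(n_neighbors=n_neighbors)),
        # Keep the k features with the highest Chi-square score.
        ("select", SelectKBest(score_func=chi2, k=k_features)),
        ("clf", ExtraTreesClassifier(n_estimators=200, random_state=seed)),
    ])


X, y = make_synthetic_data()
pipeline = build_pipeline()
# Every step is refit inside each fold, so the held-out fold never
# influences imputation or feature selection.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
print(f"Mean 5-fold accuracy on synthetic data: {scores.mean():.3f}")
```

Wrapping the steps in a single Pipeline is a design choice for the sketch: it keeps imputation and selection inside cross-validation and so avoids leaking information from the evaluation folds. The accuracy printed here reflects the synthetic data only and says nothing about the 99.25% reported in the abstract.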

Item Type: Thesis (Masters)
Uncontrolled Keywords: Chi-square, Chronic Kidney Disease, deep learning, KNN-Imputer, machine learning, missing-value
Subjects: T Technology > T Technology (General) > T57.5 Data Processing
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55101-(S2) Master Thesis
Depositing User: Ricky Mardianto
Date Deposited: 06 Feb 2025 06:45
Last Modified: 06 Feb 2025 06:45
URI: http://repository.its.ac.id/id/eprint/118415
