Samosir, Immanuel Pascanov (2025) Deteksi Penipuan Pada Informasi Rekrutmen Pekerjaan Menggunakan XGBoost Dengan Oversampling SMOTE. Other thesis, Institut Teknologi Sepuluh Nopember.
Text (Final Project, Immanuel Pascanov Samosir)
5025211257_Immanuel Pascanov Samosir_Buku TA.pdf - Accepted Version, restricted to repository staff only (5MB)
Abstract
Job recruitment fraud is increasingly prevalent, especially on online platforms, where fake job advertisements are used to deceive job seekers. This study proposes a fraud detection approach that combines the XGBoost algorithm with the SMOTE oversampling technique to address class imbalance in the data. Three experimental scenarios were tested: without SMOTE, with SMOTE, and without feature engineering. The dataset is the Employment Scam Aegean Dataset (EMSCAD), which consists of 17,880 job postings (866 fraudulent and 17,014 legitimate) with 18 features. The data were cleaned, extended with engineered boolean and numerical features, and reduced through feature selection using Information Gain. Models were evaluated with Accuracy, Precision, Recall, F1-Score, and ROC AUC. The best result was obtained in the third scenario (without feature engineering), where XGBoost used only four selected features and achieved a Recall of 0.958, an F1-Score of 0.884, and a ROC AUC of 0.988. These findings indicate that proper feature selection, even without additional feature engineering, can still improve the effectiveness of job posting fraud detection.
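The oversampling idea named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the thesis code: the function name `smote_sketch`, the toy points, and the parameters `k` and `seed` are assumptions made for illustration, and a real pipeline would more likely use an existing SMOTE implementation applied to the training split only. Only the class counts (866 and 17,014) come from the abstract.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Class counts reported in the abstract
n_fraud, n_legit = 866, 17_014
imbalance_ratio = n_legit / n_fraud  # legitimate postings per fraudulent one
n_needed = n_legit - n_fraud         # synthetic samples for a 1:1 balance

# Toy 2-D minority samples (hypothetical, not EMSCAD features)
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(toy_minority, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies. On the full dataset, a 1:1 balance would require 16,148 synthetic fraudulent postings; the count in practice depends on the size of the training split.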
Item Type: | Thesis (Other) |
---|---|
Uncontrolled Keywords: | Fraud Detection, Job Recruitment, XGBoost, SMOTE |
Subjects: | T Technology > T Technology (General) > T57.6 Operations research--Mathematics. Goal programming; T Technology > T Technology (General) > T57.8 Nonlinear programming. Support vector machine. Wavelets. Hidden Markov models; T Technology > T Technology (General) > T58.62 Decision support systems |
Divisions: | Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Artificial Intelligence Engineering > 55283-(S1) Undergraduate Thesis |
Depositing User: | Immanuel Pascanov Samosir |
Date Deposited: | 04 Aug 2025 11:16 |
Last Modified: | 04 Aug 2025 11:16 |
URI: | http://repository.its.ac.id/id/eprint/125351 |