Otomasi Pipeline Data Warehouse Menggunakan Apache Airflow dan Strategi Incremental Sync pada PT Semen Indonesia Logistik

Fakhir, Mufrih and Umara, Rafie Zaidan (2026) Otomasi Pipeline Data Warehouse Menggunakan Apache Airflow dan Strategi Incremental Sync pada PT Semen Indonesia Logistik. Project Report. [s.n.], [s.l.]. (Unpublished)

5025231245_5025231248-Project_Report.pdf - Accepted Version
Restricted to Repository staff only


Abstract

PT Semen Indonesia Logistik (SILOG) faces challenges in managing a large, fragmented volume of operational data spread across 88 operational tables in its office systems. Manual data processing carries risks of inconsistency and delayed information, and it lacks the external data context relevant to logistics operations. This research aims to design and implement the “Automated Data Factory” (“Pabrik Data Otomatis”), a centralized data integration platform based on the ELT (Extract, Load, Transform) methodology. The system uses Apache Airflow as the primary orchestrator, managing the entire data lifecycle in a containerized environment (Docker/Podman). Key features include an incremental synchronization mechanism for efficient data transfer, soft deletes on 21 transactional tables to preserve the integrity of the audit history, and data enrichment through integration with the BMKG (weather forecast) and GraphHopper (route metrics) APIs. The system also applies metadata caching and connection reuse to optimize execution performance. The implementation results show a significant efficiency gain: synchronization time was reduced to 5–10 seconds per incremental session, which is 5 to 10 times faster than conventional methods. Data processed in the analytics layer is visualized through Metabase, giving management an interactive monitoring interface. In conclusion, the “Automated Data Factory” system provides a stable, scalable, and reliable data foundation to support data-driven decision-making at PT Semen Indonesia Logistik.
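
The full report is restricted, so the authors' implementation is not visible here. As a rough illustration of what incremental synchronization combined with soft deletes can look like, the following is a minimal sketch using only Python's standard `sqlite3` module. The table name `shipments`, the columns `updated_at` and `is_deleted`, the integer watermark, and the function `incremental_sync` are all hypothetical choices made for this example, not details taken from the report.

```python
import sqlite3


def incremental_sync(src, dst, table, last_sync):
    """Copy rows changed since `last_sync`; flag rows missing from the source.

    Assumed (hypothetical) schema: id PRIMARY KEY, payload, updated_at,
    plus an is_deleted flag on the destination table only.
    """
    # Incremental step: fetch only rows newer than the stored watermark.
    changed = src.execute(
        f"SELECT id, payload, updated_at FROM {table} WHERE updated_at > ?",
        (last_sync,),
    ).fetchall()
    # Upsert so that re-running the same sync does not create duplicates.
    dst.executemany(
        f"INSERT INTO {table} (id, payload, updated_at, is_deleted) "
        "VALUES (?, ?, ?, 0) "
        "ON CONFLICT(id) DO UPDATE SET payload = excluded.payload, "
        "updated_at = excluded.updated_at, is_deleted = 0",
        changed,
    )
    # Soft delete: rows that vanished from the source are flagged, not removed,
    # so their history stays available for auditing.
    src_ids = {row[0] for row in src.execute(f"SELECT id FROM {table}")}
    live_ids = {
        row[0]
        for row in dst.execute(f"SELECT id FROM {table} WHERE is_deleted = 0")
    }
    gone = live_ids - src_ids
    dst.executemany(
        f"UPDATE {table} SET is_deleted = 1 WHERE id = ?",
        [(row_id,) for row_id in gone],
    )
    dst.commit()
    # The new watermark is the newest change seen, or the old one if none.
    new_watermark = max((row[2] for row in changed), default=last_sync)
    return len(changed), len(gone), new_watermark


def demo():
    """Run two sync passes against in-memory databases."""
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    src.execute(
        "CREATE TABLE shipments "
        "(id INTEGER PRIMARY KEY, payload TEXT, updated_at INTEGER)"
    )
    dst.execute(
        "CREATE TABLE shipments (id INTEGER PRIMARY KEY, payload TEXT, "
        "updated_at INTEGER, is_deleted INTEGER DEFAULT 0)"
    )
    src.executemany(
        "INSERT INTO shipments VALUES (?, ?, ?)", [(1, "a", 10), (2, "b", 20)]
    )
    first = incremental_sync(src, dst, "shipments", 0)
    # Simulate source activity: one row deleted, one row updated.
    src.execute("DELETE FROM shipments WHERE id = 1")
    src.execute("UPDATE shipments SET payload = 'b2', updated_at = 30 WHERE id = 2")
    second = incremental_sync(src, dst, "shipments", first[2])
    rows = dst.execute(
        "SELECT id, payload, is_deleted FROM shipments ORDER BY id"
    ).fetchall()
    return first, second, rows
```

In this sketch the second pass transfers only the single updated row and flags the deleted one, which is the general idea behind reducing transfer volume per session. In an orchestrated setup a function like this would typically run as one scheduled task per table, with the watermark persisted between runs; how the report's system stores its watermark and detects deletions is not stated in the abstract.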

Item Type: Monograph (Project Report)
Uncontrolled Keywords: Apache Airflow, Data Warehouse, ELT, Incremental Sync, Logistics, Soft Delete, Analytics Visualization.
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Information Technology > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Rafie Zaidan Umara
Date Deposited: 17 Jun 2026 02:27
Last Modified: 17 Jun 2026 02:27
URI: http://repository.its.ac.id/id/eprint/133825
