Klasifikasi Jual/Beli Saham Berdasarkan Fusi Data Historis dan Sentimen Berita dengan Menggunakan Machine Learning

Hamdika, Abidjanna Zulfa (2024) Klasifikasi Jual/Beli Saham Berdasarkan Fusi Data Historis dan Sentimen Berita dengan Menggunakan Machine Learning. Diploma thesis, Institut Teknologi Sepuluh Nopember.

5025201197-Undergraduate_Thesis.pdf - Accepted Version
Restricted to Repository staff only until 1 October 2026.

Abstract

Stock prices move every day, both up and down. These movements are driven by many factors and are often unpredictable. One such factor is news sentiment: the news may concern the company directly, resource prices, national economic conditions, or the global economy. Another factor is the stock's own past price movement. Technical indicators, which are computed mathematically from historical price movements, are among the tools investors most often rely on to predict stock prices.
This final project develops a classification model that predicts stock buy/sell strategies by fusing historical price data with news sentiment using machine learning. The historical and news data cover 10 listed companies over a 5-year period. Technical indicators are computed from the historical data. Five indicators are used: Simple Moving Average (SMA), Exponential Moving Average (EMA), Relative Strength Index (RSI), Moving Average Convergence/Divergence (MACD), and Bollinger Bands (BB). The news data are collected, cleaned, and classified as positive, negative, or neutral. Sentiment classification was done in two stages: manual classification with the help of a lexicon, followed by pseudolabelling of all news data. The best pseudolabelling model was a Support Vector Machine with an 80% confidence threshold.
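As a rough illustration of how the five indicators named above can be derived from a series of closing prices, the following sketch uses only the Python standard library. The function names, the default window lengths (14, 12/26/9, 20), and the use of simple averages in the RSI are assumptions made for this example; the abstract does not state the parameters or variants used in the thesis.

```python
from statistics import mean, pstdev


def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(prices[i + 1 - window:i + 1]))
    return out


def ema(prices, span):
    """Exponential moving average with alpha = 2 / (span + 1),
    seeded with the first price (one common convention)."""
    alpha = 2 / (span + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out


def rsi(prices, window=14):
    """RSI using simple averages of gains and losses over the last
    `window` price changes (assumed variant, not Wilder smoothing)."""
    out = [None] * len(prices)
    changes = [b - a for a, b in zip(prices, prices[1:])]
    for i in range(window, len(prices)):
        recent = changes[i - window:i]
        avg_gain = sum(c for c in recent if c > 0) / window
        avg_loss = sum(-c for c in recent if c < 0) / window
        if avg_loss == 0:
            out[i] = 100.0  # no losses in the window (convention used here)
        else:
            out[i] = 100 - 100 / (1 + avg_gain / avg_loss)
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """MACD line (fast EMA minus slow EMA) and its signal line."""
    line = [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]
    return line, ema(line, signal)


def bollinger(prices, window=20, k=2):
    """(lower, middle, upper) bands: SMA plus/minus k population std devs."""
    bands = []
    for i, mid in enumerate(sma(prices, window)):
        if mid is None:
            bands.append((None, None, None))
        else:
            sd = pstdev(prices[i + 1 - window:i + 1])
            bands.append((mid - k * sd, mid, mid + k * sd))
    return bands
```

Each function returns one value per trading day, so the outputs can be placed side by side as feature columns before being fused with the daily sentiment label.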
Buy/sell classification was built on the fused historical and sentiment data, which comprise the technical indicators and the sentiment obtained from pseudolabelling. Six models were evaluated: four ensemble learning models (Random Forest, AdaBoost, XGBoost, and Voting) and two single learning models (Support Vector Machine and K-Nearest Neighbors). The best buy/sell classification was built using Principal Component Analysis (PCA) with a 0.95 explained variance ratio, a 10-day sliding window, and a 10-day target. The best model was the Voting model that combines the 2 models with the highest accuracy. The best result was obtained on data for the company EXCL, with an Accuracy of 0.6292, a Macro Precision of 0.7211, and a Weighted Precision of 0.7262.
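The sliding window, the target day, and the two precision figures reported above can be made concrete with a small standard-library sketch. The helper names are hypothetical, and the labelling rule (buy when the close 10 days ahead is higher than the current close, otherwise sell) is an assumption for illustration only; the abstract does not define how the buy/sell target is derived.

```python
from collections import Counter


def make_windows(features, closes, window=10, horizon=10):
    """Flatten `window` consecutive daily feature rows into one sample and
    label it from the close `horizon` days after the window's last day."""
    samples, labels = [], []
    for end in range(window - 1, len(closes) - horizon):
        rows = features[end - window + 1:end + 1]
        samples.append([value for row in rows for value in row])
        # Assumed labelling rule, not taken from the thesis.
        labels.append("buy" if closes[end + horizon] > closes[end] else "sell")
    return samples, labels


def precision_scores(y_true, y_pred):
    """Return (macro precision, weighted precision).

    Macro averages per-class precision equally; weighted averages it
    by each class's share of the true labels (its support)."""
    classes = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    per_class = {}
    for c in classes:
        predicted = sum(1 for p in y_pred if p == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == p == c)
        per_class[c] = correct / predicted if predicted else 0.0
    macro = sum(per_class.values()) / len(classes)
    weighted = sum(per_class[c] * support[c] for c in classes) / len(y_true)
    return macro, weighted


# Usage on toy data: 25 rising closes, one feature per day.
closes = list(range(1, 26))
features = [[c] for c in closes]
X, y = make_windows(features, closes)
macro, weighted = precision_scores(
    ["buy", "buy", "buy", "sell"], ["buy", "buy", "sell", "sell"]
)
```

In this toy evaluation the two scores differ (0.75 macro against 0.875 weighted) because the classes are imbalanced, which is also why the abstract reports the two figures separately.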

Item Type: Thesis (Diploma)
Uncontrolled Keywords: Buy/Sell Classification, Machine Learning, News Sentiment, Stock, Technical Indicators
Subjects: Q Science > Q Science (General) > Q325.5 Machine learning. Support vector machines.
Q Science > QA Mathematics > QA336 Artificial Intelligence
Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science)
Divisions: Faculty of Intelligent Electrical and Informatics Technology (ELECTICS) > Informatics Engineering > 55201-(S1) Undergraduate Thesis
Depositing User: Hamdika Abidjanna Zulfa
Date Deposited: 01 Aug 2024 06:24
Last Modified: 01 Aug 2024 06:24
URI: http://repository.its.ac.id/id/eprint/110623
