Akib, Zulfiana S. (2025) Analisis Mutasi pada Spike Protein SARS-CoV-2 dengan Multiple Sequence Alignment menggunakan Hidden Markov Model. Masters thesis, Institut Teknologi Sepuluh Nopember.
Text: 6002201019-Master_Thesis.pdf - Accepted Version (18MB), restricted to repository staff only
Abstract
The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has given rise to numerous variants through genetic mutation. The SARS-CoV-2 spike protein is a key component that undergoes many mutations and plays a role in the infection of host cells. This study analyzes the alignment and relatedness of SARS-CoV-2 variants using Multiple Sequence Alignment based on the Hidden Markov Model (HMM) with the Baum-Welch and Viterbi algorithms. The alignment results show that HMM produced the highest score, 15,896,888, compared with the ClustalW and Muscle methods. A phylogenetic tree was built using the Neighbor Joining method, showing the genetic closeness among variants. The greatest distance was found between the Wuhan (China) and Omicron (South Africa) variants, with a value of 0.035. Several parts of the spike protein, such as positions 28-51 and 614, were identified as conserved regions across variants. These findings are important for understanding the evolution of the virus and as a basis for developing effective vaccines.
############################################################
The spike protein of SARS-CoV-2 plays a crucial role in the virus's ability to infect human cells and is known to undergo frequent mutations. This study aims to analyze the genetic relationships between SARS-CoV-2 variants based on spike protein data. Sequence alignment was performed using the Hidden Markov Model (HMM) method with Baum-Welch and Viterbi algorithms. The results showed that HMM produced the best alignment compared to ClustalW and Muscle, with the highest sum-of-pairs score of 15,896,888. A phylogenetic tree was then constructed using the Neighbor Joining method, revealing that the Wuhan variant (China) and the Omicron variant (South Africa) had the greatest genetic distance, with a value of 0.035. In addition, several parts of the spike protein were found to remain unchanged across most variants, referred to as conserved regions. These regions are useful for identifying stable parts of the virus despite ongoing mutations and can support further research on viral evolution and transmission.
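The abstract compares alignments by their sum-of-pairs score. As a minimal sketch of how such a score can be computed, the Python snippet below sums a pairwise score over every pair of residues in every alignment column. The function name `sum_of_pairs`, the toy sequences, and the match/mismatch/gap values are illustrative assumptions; the thesis's actual scoring scheme (for example, a substitution matrix) is not stated in this record.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of a multiple alignment.

    `alignment` is a list of equal-length strings, with '-' marking gaps.
    The scoring values are placeholder assumptions, not the thesis's scheme.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")

    total = 0
    # Walk the alignment column by column.
    for column in zip(*alignment):
        # Score every unordered pair of symbols in the column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs contribute nothing in this sketch
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


# Hypothetical toy alignment, not real spike protein data.
toy = ["MFVFL", "MFV-L", "MFLFL"]
print(sum_of_pairs(toy))
```

With a fixed scoring scheme, a higher total means more agreement between the aligned sequences, which is how the scores of different aligners can be ranked against each other.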
Item Type: | Thesis (Masters) |
---|---|
Uncontrolled Keywords: | SARS-CoV-2, Spike Protein, Multiple Sequence Alignment, Hidden Markov Model, Neighbor Joining |
Subjects: | Q Science > QA Mathematics; Q Science > QA Mathematics > QA274.7 Markov processes--Mathematical models; Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science); Q Science > QA Mathematics > QA9.58 Algorithms |
Divisions: | Faculty of Science and Data Analytics (SCIENTICS) > Mathematics > 44101-(S2) Master Thesis |
Depositing User: | Zulfiana S. Akib |
Date Deposited: | 04 Aug 2025 08:52 |
Last Modified: | 04 Aug 2025 08:52 |
URI: | http://repository.its.ac.id/id/eprint/127000 |