Mutation Analysis of the SARS-CoV-2 Spike Protein with Multiple Sequence Alignment Using a Hidden Markov Model

Akib, Zulfiana S. (2025) Analisis Mutasi pada Spike Protein SARS-CoV-2 dengan Multiple Sequence Alignment menggunakan Hidden Markov Model. Masters thesis, Institut Teknologi Sepuluh Nopember.

Abstract

The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has given rise to numerous variants through genetic mutation. The SARS-CoV-2 spike protein is a key component that undergoes many mutations and plays a role in infecting host cells. This study analyzes the alignment and relatedness of SARS-CoV-2 variants using Multiple Sequence Alignment based on a Hidden Markov Model (HMM) with the Baum-Welch and Viterbi algorithms. The alignment results show that HMM produced the highest score, 15,896,888, compared with the ClustalW and Muscle methods. A phylogenetic tree was built with the Neighbor Joining method, showing the genetic closeness among variants. The greatest distance was found between the Wuhan (China) and Omicron (South Africa) variants, with a value of 0.035. Several parts of the spike protein, such as positions 28-51 and 614, were identified as conserved regions across variants. These findings are important for understanding viral evolution and as a basis for developing effective vaccines.
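
The abstract does not state which distance measure produced the 0.035 value. As a minimal sketch, assuming a simple p-distance (the proportion of differing residues between two aligned sequences), a pairwise distance table of the kind Neighbor Joining takes as input could be computed as follows. The function names and the toy sequences are hypothetical and are not taken from the thesis.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues, ignoring columns where either row has a gap.

    Assumption: this is a plain p-distance; the thesis may use a different measure.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def distance_table(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of named aligned sequences."""
    names = list(aligned)
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Toy aligned fragments (hypothetical, not real spike protein data).
toy = {
    "variant_A": "ACDE",
    "variant_B": "ACDF",
    "variant_C": "AC-E",
}
table = distance_table(toy)
```

A table like this holds one distance per pair of variants; the largest entry corresponds to the most distant pair, which is how a statement such as "the greatest distance is between two variants" would be read off.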
############################################################
The spike protein of SARS-CoV-2 plays a crucial role in the virus's ability to infect human cells and is known to undergo frequent mutations. This study aims to analyze the genetic relationships between SARS-CoV-2 variants based on spike protein data. Sequence alignment was performed using the Hidden Markov Model (HMM) method with Baum-Welch and Viterbi algorithms. The results showed that HMM produced the best alignment compared to ClustalW and Muscle, with the highest sum-of-pairs score of 15,896,888. A phylogenetic tree was then constructed using the Neighbor Joining method, revealing that the Wuhan variant (China) and the Omicron variant (South Africa) had the greatest genetic distance, with a value of 0.035. In addition, several parts of the spike protein were found to remain unchanged across most variants, referred to as conserved regions. These regions are useful for identifying stable parts of the virus despite ongoing mutations and can support further research on viral evolution and transmission.
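
To make the sum-of-pairs score and the notion of a conserved region concrete, here is a minimal sketch. The scoring values (match +1, mismatch -1, residue-gap -2, gap-gap 0) are assumptions chosen for illustration; the abstract does not state the scoring scheme used in the thesis, and the function names and toy alignment are hypothetical.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap_penalty=-2, gap="-"):
    """Score one pair of symbols from the same alignment column (assumed scheme)."""
    if a == gap and b == gap:
        return 0
    if a == gap or b == gap:
        return gap_penalty
    return match if a == b else mismatch


def sum_of_pairs(alignment, **scoring):
    """Sum the pair scores over every column and every unordered pair of rows."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment must be non-empty with rows of equal length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b, **scoring)
    return total


def conserved_columns(alignment, gap="-"):
    """1-based alignment columns where every row carries the same non-gap residue."""
    return [
        i
        for i, column in enumerate(zip(*alignment), start=1)
        if len(set(column)) == 1 and column[0] != gap
    ]


# Toy alignment of three short fragments (hypothetical, not real spike data).
toy_alignment = ["MKV-L", "MKVAL", "MRV-L"]
score = sum_of_pairs(toy_alignment)           # 4 under the assumed scheme
conserved = conserved_columns(toy_alignment)  # [1, 3, 5]
```

Under this scheme a higher total means more agreeing residue pairs and fewer gaps, which is the sense in which one alignment can be ranked above another. Column indices here refer to the alignment itself; relating them to residue positions in a reference protein would require an additional mapping that skips gap columns.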

Item Type: Thesis (Masters)
Uncontrolled Keywords: SARS-CoV-2, Spike Protein, Multiple Sequence Alignment, Hidden Markov Model, Neighbor Joining
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA274.7 Markov processes--Mathematical models.
Q Science > QA Mathematics > QA76.9.D343 Data mining. Querying (Computer science)
Q Science > QA Mathematics > QA9.58 Algorithms
Divisions: Faculty of Science and Data Analytics (SCIENTICS) > Mathematics > 44101-(S2) Master Thesis
Depositing User: Zulfiana S. Akib
Date Deposited: 04 Aug 2025 08:52
Last Modified: 04 Aug 2025 08:52
URI: http://repository.its.ac.id/id/eprint/127000
