Multiple DNA/RNA sequence alignment and phylogenetic tree-building programme called SaAlign for extremely huge datasets and extremely lengthy sequences

Jung Cook

doi:10.14303/2250-9941.2022.37

International Research Journal of Biochemistry and Bioinformatics

Abstract

Multiple DNA/RNA sequence alignment is a foundational bioinformatics technique, particularly for building phylogenetic trees. Advances in DNA sequencing keep increasing the volume of bioinformatics data, so alignment tools must be revised continually. Analysing the mitochondrial genomes of many individuals and species depends on bioinformatics software, and the performance of that software therefore has to improve. We used longest-common-substring techniques to optimise a dynamic programming solution for aligning extremely large datasets and extremely long sequences (Liou et al., 2013). The resulting Multiple DNA/RNA Sequence Alignment Tool Based on Suffix Tree (SaAlign) aligns sequences of diverse lengths, some exceeding 300 kb (kilobases), and saved time and memory on extremely large test DNA datasets. It performed better than existing tools such as MAFFT and HAlign-II. MAFFT completed the required tasks on mitochondrial genome datasets with a small number of sequences; however, it could not handle extremely large mitochondrial genome datasets because of core dump errors. To maximise space and time efficiency, we built the multiple DNA/RNA sequence alignment tool on the Center Star technique and applied a suffix array algorithm (Siniscalco et al., 2008). As whole-genome research and NGS technologies gain traction, saving computing resources matters for laboratories. The programme is particularly relevant to these areas, especially the study of whole plant mitochondrial genomes (Tzouvelekis et al., 2013).
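
The abstract combines a suffix array with longest-common-substring search to reduce the work left for dynamic programming. The Python sketch below illustrates that general idea for two short sequences; it is not SaAlign's implementation. The function names, the naive sort-based suffix array construction, and the `#` separator are assumptions made for illustration only, and a production tool would use a far more efficient construction.

```python
def suffix_array(text):
    # Naive construction (assumption, for clarity only): sort suffix start
    # positions by the suffix text they begin.
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(text, i, j):
    # Length of the shared prefix of the suffixes starting at i and j.
    n = 0
    while i + n < len(text) and j + n < len(text) and text[i + n] == text[j + n]:
        n += 1
    return n


def longest_common_substring(a, b, sep="#"):
    # Returns (start_in_a, start_in_b, length) of one longest common substring.
    # `sep` is a hypothetical separator that must not occur in either sequence.
    if sep in a or sep in b:
        raise ValueError("separator must not occur in the sequences")
    text = a + sep + b
    offset = len(a) + 1  # index in `text` where b begins
    sa = suffix_array(text)
    best = (0, 0, 0)
    # The longest match between a and b always shows up between two
    # neighbouring suffixes that originate from different sequences.
    for p, q in zip(sa, sa[1:]):
        p_in_a, q_in_a = p < len(a), q < len(a)
        if p_in_a == q_in_a:
            continue
        n = common_prefix_len(text, p, q)
        if n > best[2]:
            i, j = (p, q) if p_in_a else (q, p)
            best = (i, j - offset, n)
    return best


start_a, start_b, length = longest_common_substring("GATTACA", "TTACGG")
anchor = "GATTACA"[start_a:start_a + length]
```

In an anchor-based aligner, a shared block such as `anchor` can be treated as already aligned, so dynamic programming is only needed for the shorter regions to its left and right.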
