Design of Pipelined Architecture for Biological Sequence Alignment
Abstract
Sequence alignment is crucial for genome analysis because it allows us to identify similarities in biological sequences. Sequence alignment techniques are used to compare newly generated sequences to reference sequences, finding mutations, genetic variants, and other significant features. The thesis first identifies pair-wise sequence alignment as the primary method of alignment, dominating bioinformatics analysis for the massive production of genome data; the sheer volume of data not only limits efficiency but also raises run-time memory requirements. Because sequence alignment relies on a computationally intensive dynamic programming paradigm, it needs acceleration on an accurate, high-speed, and efficient system. We propose four key contributions to accomplish these objectives.
A memory-efficient architecture for computationally expensive pair-wise global alignment techniques has been proposed to achieve high throughput by employing a score matrix with O(n) space complexity instead of the traditional O(jn) matrix, where j is the number of nucleotides in the query and n is the number of nucleotides in the reference sequence. Next, we present a coarse-grained pipeline scheme that achieves temporal parallelism and high throughput (∝ n), building on the preceding memory-efficient architecture to prepare a direction array for multiple sequences.
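As a software illustration of the O(n) score storage idea, the following minimal sketch computes a global alignment score while keeping only one row of the score matrix. It is not the thesis's hardware architecture: the function name `global_alignment_score`, the linear gap model, and the scoring values (`match=1`, `mismatch=-1`, `gap=-2`) are assumptions chosen for illustration, and the sketch returns only the score, not the direction array used for trace-back.

```python
def global_alignment_score(query, reference, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch style) alignment score in O(n) space.

    Assumed scoring scheme: linear gap penalty, fixed match/mismatch scores.
    Only one row of n + 1 scores is stored instead of the full j x n matrix.
    """
    n = len(reference)
    # Row 0: an empty query prefix aligned against each reference prefix.
    row = [k * gap for k in range(n + 1)]

    for i, q in enumerate(query, start=1):
        diag = row[0]        # score at (i-1, 0) before it is overwritten
        row[0] = i * gap     # score at (i, 0): query prefix against empty reference
        for k in range(1, n + 1):
            up = row[k]          # score at (i-1, k), still the previous row's value
            left = row[k - 1]    # score at (i, k-1), already updated this row
            substitution = match if q == reference[k - 1] else mismatch
            score = max(diag + substitution, up + gap, left + gap)
            diag = up            # becomes the diagonal for column k + 1
            row[k] = score
    return row[n]
```

Under these assumed scores, aligning `"GATTACA"` against itself yields 7 (seven matches), and aligning it against `"GATTAC"` yields 4 (six matches and one gap). The working memory grows with the reference length n only, independent of the query length j.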
To improve throughput and power efficiency, a pipelined multiprocessor architecture for pair-wise sequence alignment has been developed. A novel optimised trace-back engine has been created and integrated with newly introduced multiple small register files to avoid memory bandwidth bottlenecks and to reduce data transfer overhead, run-time storage needs, and power consumption. Lastly, a hierarchical multi-processing architecture has been developed to support alignment operations on multiple sequences. In addition, parallel inter-sequence computation and intra-sequence trace-back value creation maximise processing element utilisation. Furthermore, it can accomplish