Everybody Can Change: Rocky IV

Everybody Can Change: A Critical Cinematic, Philosophical, Socio-Political, Theological Literary Analysis of Sylvester Stallone's Seminal Work, Rocky IV A Downloadable White Paper / E-Book Presented Free of Charge by Gut Check Press™ By Ted Kluck and Zachary Bartels August, 2010 gut check smackademic™

Nguyen et al. BMC Genomics 2011, 12(Suppl 5):S4
http://www.biomedcentral.com/1471-2164/12/S5/S4
RESEARCH Open Access
Parallel progressive multiple sequence alignment
on reconfigurable meshes
Ken D Nguyen^1, Yi Pan^2*, Ge Nong^3
From BIOCOMP 2010. The 2010 International Conference on Bioinformatics and Computational Biology
Las Vegas, NV, USA. 12-15 July 2010
Abstract
Background: One of the most fundamental and challenging tasks in bio-informatics is to identify related sequences
and their hidden biological significance. The most popular and proven best-practice method to accomplish this task
is aligning multiple sequences together. However, multiple sequence alignment is a computationally intensive task. In
addition, the advancement in DNA/RNA and protein sequencing techniques has created a vast number of sequences
to be analyzed, exceeding the capability of traditional computing models. Therefore, an effective parallel multiple
sequence alignment model capable of resolving these issues is in great demand.
Results: We design $O(1)$ run-time solutions for both local and global dynamic programming pair-wise alignment
algorithms on the reconfigurable mesh computing model. To align $m$ sequences with maximum length $n$, we combine
the parallel pair-wise dynamic programming solutions with newly designed parallel components. We successfully
reduce the progressive multiple sequence alignment algorithm's run-time complexity from $O(m \times n^4)$ to $O(m)$
using $O(m \times n^3)$ processing units for scoring schemes that use three distinct values for match/mismatch/gap-extension.
The general solution to the multiple sequence alignment algorithm takes $O(m \times n^4)$ processing units and
completes in $O(m)$ time.
Conclusions: To our knowledge, this is the first time the progressive multiple sequence alignment algorithm has been
completely parallelized with $O(m)$ run-time. We also provide a new parallel algorithm for the Longest Common
Subsequence (LCS) with $O(1)$ run-time using $O(n^3)$ processing units. This is a big improvement over the current
best constant-time algorithm, which uses $O(n^4)$ processing units.

* Correspondence: pan@cs.gsu.edu
^2 Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
Full list of author information is available at the end of the article
© 2011 Nguyen et al. licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited.

Background
The advancement of DNA/RNA and protein sequencing and sequence identification has created numerous databases of sequences. One of the most fundamental and challenging tasks in bio-informatics is to identify related sequences and their hidden biological significance. Aligning multiple sequences together provides researchers with one of the best solutions to this task. In general, multiple sequence alignment can be defined as:

Definition 1
Given: $m$ sequences, $(s_1, s_2, \ldots, s_m)$, over an alphabet $\Sigma$, where each sequence contains up to $n$ symbols from $\Sigma$; a scoring function $h: \Sigma \times \Sigma \times \cdots \times \Sigma \to \mathbb{R}$; and a gap cost function. Multiple sequence alignment is a technique to transform $(s_1, s_2, \ldots, s_m)$ to $(s'_1, s'_2, \ldots, s'_m)$, where $s'_i$ is $s_i \cup$ '-' [gap insertions], that optimizes the matching scores between the residues across all sequence columns [1].

However, multiple sequence alignment is an NP-Complete problem [2]; therefore, it is often solved by heuristic techniques. Progressive multiple sequence alignment is one of the most popular multiple sequence alignment techniques, in which the pair-wise symbol matching scores can be derived from any scoring scheme or obtained from a substitution scoring matrix such as PAM [3] or BLOSUM [4]. There are many implementations of progressive multiple sequence alignment, as seen in [5-8]. In general, a progressive multiple sequence alignment algorithm follows three steps:
(i) Perform all pair-wise alignments of the input sequences.
(ii) Compute a dendrogram indicating the order in which the sequences are to be aligned.
(iii) Pair-wise align two sequences (or two pre-aligned groups of sequences) following the dendrogram, starting from the leaves and moving to the root of the dendrogram.

Figure 1 shows an example of these steps, where (a) represents the input sequences, (b) represents an alignment of step (i), (c) shows the dendrogram obtained from step (ii), and (d) shows a pair-wise group alignment in step (iii).

Figure 1 A progressive multiple sequence alignment. An example of progressive multiple sequence alignment. (a) represents three input sequences (S1, S2, S3); (b) shows the pair-wise dynamic programming alignment of two sequences; (c) shows the order of the sequences to be aligned, where the leaves on the right-hand side are the input sequences, the internal nodes represent the theoretical ancestors from which the sequences are derived, and the characters on the tree branches represent the missing/mutated residues; and (d) shows the pair-wise dynamic programming of two pre-aligned groups of sequences.

Step (i) can be optimally solved by a Dynamic Programming (DP) algorithm. There are two versions of DP: the Smith-Waterman algorithm [9] is used to find the optimally aligned segment between two sequences (local DP), and the Needleman-Wunsch algorithm [10] is used to find the globally optimal pair-wise alignment of the whole sequences (global DP). The two algorithms are very similar and will be described in more detail in the next section. The dynamic programming algorithms take $O(n^2)$ time to complete, including the back-tracking steps. Thus, with $\frac{m(m-1)}{2}$ unique pairs of the input sequences, the run-time complexity of step (i) is $O(m^2 n^2)$, or $O(n^4)$ if $n$ and $m$ are asymptotically equivalent.

To generate a dendrogram from the distances between the sequences (or the scores generated from step (i)), either UPGMA [11] or Neighbor Joining (NJ) [12] hierarchical clustering is used. These algorithms yield $O(m^3)$ run-time complexity.

In the worst case, step (iii) performs $(m-1)$ pair-wise alignments in order, following the dendrogram hierarchy. As in step (i), dynamic programming for pair-wise alignment is used; however, each of these pair-wise group alignments is of order $O(n^4)$, via dynamic programming ($O(n^2)$) and the sum-of-pair scoring function [13] ($O(n^2)$). This scoring function is required to evaluate all possible residue matchings of the sequences. As a result, the run-time complexity of step (iii) is $O(m \times n^4) \approx O(n^5)$, which is the overall run-time complexity of the progressive multiple sequence alignment algorithm.

Optimal pair-wise sequence alignment by dynamic programming
Consider two sequences $x$ and $y$, each containing up to $n$ residue symbols. The optimal alignment of these sequences can be found by calculating an $(n+1) \times (n+1)$ dynamic programming (DP) matrix containing all possible pair-wise matching scores of residue symbols in the sequences. Initially, the first row and column of the matrix cells are set to 0, i.e.

$$c_{0,j} = 0, \qquad c_{i,0} = 0.$$

The recursive formula to compute the DP matrix for the Longest Common Subsequence (LCS), as seen in [14], is:

$$c_{i,j} = \begin{cases} c_{i-1,j-1} + 1 & \text{if } x_i = y_j \\ \max\{c_{i-1,j},\; c_{i,j-1}\} & \text{if } x_i \neq y_j \end{cases}$$

Similarly, the Needleman-Wunsch algorithm [10] uses the following formula to complete the DP matrix:

$$c_{i,j} = \max \begin{cases} c_{i-1,j-1} + s(x_i, y_j) & \text{symbol matching} \\ c_{i-1,j} + g & \text{gap insertion} \\ c_{i,j-1} + g & \text{gap insertion} \end{cases}$$

where $s(x_i, y_j)$ is the pair-wise symbol matching score of the two symbols $x_i$ and $y_j$ from sequences $x$ and $y$, respectively; and $g$ is the gap cost for extending a sequence by inserting a gap, i.e. gap insertion/deletion (indel).

Smith and Waterman [9] modified the above formula as:

$$c_{i,j} = \max \begin{cases} 0 \\ c_{i-1,j-1} + s(x_i, y_j) & \text{symbol matching} \\ c_{i-1,j} + g & \text{gap insertion} \\ c_{i,j-1} + g & \text{gap insertion} \end{cases}$$
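The three recurrences above differ only in small ways, which a short sequential sketch can make concrete. The Python below is a minimal illustration and not the authors' implementation: the function names (`lcs_matrix`, `alignment_matrix`, `simple_score`) and the match/mismatch/gap values are assumptions chosen for the example. It follows the text in setting the first row and column to 0; many textbook versions of the global algorithm instead accumulate gap costs along the borders.

```python
def lcs_matrix(x, y):
    """Fill the LCS matrix: c[i][j] is the LCS length of x[:i] and y[:j]."""
    c = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def simple_score(a, b, match=1, mismatch=-1):
    """Hypothetical scoring scheme standing in for s(x_i, y_j)."""
    return match if a == b else mismatch


def alignment_matrix(x, y, score, gap, local=False):
    """Fill the DP matrix with the global recurrence, or with the
    local (Smith-Waterman style) recurrence when local=True."""
    c = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            best = max(
                c[i - 1][j - 1] + score(x[i - 1], y[j - 1]),  # symbol matching
                c[i - 1][j] + gap,                            # gap insertion
                c[i][j - 1] + gap,                            # gap insertion
            )
            # The local variant never lets a cell drop below 0.
            c[i][j] = max(0, best) if local else best
    return c
```

Each function visits every cell once, which is the sequential $O(n^2)$ cost quoted for step (i). The only difference between the global and local variants in this sketch is the extra 0 in the maximum.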
The alignment can be obtained from the DP matrix by starting from cell $c_{n,n}$ (or the cell containing the maximum value in the matrix, as in the Smith-Waterman algorithm) and tracking back to the top of the matrix, i.e. cell $c_{0,0}$, by following neighboring cells with the largest value.

Existing parallel implementations
Progressive multiple sequence alignment algorithms are widely parallelized, mostly because they perform $\frac{m(m-1)}{2}$ independent pair-wise alignments as in step (i). These individual pair-wise alignments can be designated to dif- [...]

Furthermore, there are attempts to parallelize the progressive alignment step [step (iii)], as in [28] and [29]. In [28], the independent pre-aligned pairs along the dendrogram are aligned simultaneously. This technique gains some speed-up; however, the time complexity of the algorithm remains unchanged, since all the pair-wise alignments must eventually be merged together. Another attempt is seen in [29], where Huang's algorithm [25] is used. When an anti-diagonal of a DP alignment matrix in a lower tree level in step (iii) is completed, it is distributed immediately to other processors for computing the [...]
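As a rough illustration of the independence noted at the start of this section, the sketch below scores all $\frac{m(m-1)}{2}$ pairs of step (i) concurrently. It is a minimal Python example under stated assumptions, not any of the cited implementations: `global_score` and `all_pairwise_scores` are hypothetical names, the scoring values are arbitrary, and a thread pool is used only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def global_score(x, y, match=1, mismatch=-1, gap=-1):
    """Global DP score with a zero-initialized border, keeping two rows."""
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[-1]


def all_pairwise_scores(seqs, workers=4):
    """Score every unordered pair of sequences; no pair depends on another."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda p: global_score(seqs[p[0]], seqs[p[1]]), pairs)
        return dict(zip(pairs, scores))
```

Because no pair reads another pair's result, the work can be split across workers without coordination. In CPython, threads will not speed up this pure-Python loop; a process pool or a compiled kernel would be needed for a real gain, so the thread pool here only shows the structure of the decomposition.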
