Google Answers Logo
View Question
 
Q: Dynamic Programming Algorithms for Pairwise Alignment with Affine Gap Scores ( No Answer,   0 Comments )
Question  
Subject: Dynamic Programming Algorithms for Pairwise Alignment with Affine Gap Scores
Category: Computers > Algorithms
Asked by: funnyman3595-ga
List Price: $2.00
Posted: 22 Apr 2004 16:01 PDT
Expires: 22 May 2004 16:01 PDT
Question ID: 334550
I have a 1998 book[1] which describes a dynamic programming algorithm
for generating a pairwise alignment using affine gap scores.  The
method, outlined below, is from a 1982 journal article[2].  In
particular, I am interested in the application of this method to
finding alignments of amino acid sequences.  Is this method is still
in common usage?  If it is still used, what, if any, improvements have
been made upon the original technique?  If it is not still used, what
has replaced it?


The algorithm takes as input two strings, representing the sequences
to be compared, a scoring matrix, representing the scores for any
possible pairing of two characters, and two numbers, representing the
penalties for beginning and extending a gap.  The particular usage I
am dealing with typically uses BLOSUM50 as the matrix, 12 as the
penalty for beginning a gap, and 2 as the penalty for extending a gap.
 The strings are letters (case independant) which are the common
single-letter abbreviations of amino acids.

Let m be the length of the first sequence, n be the length of the
second sequence, d be the gap open penalty (12 above), e be the gap
extention penalty (2 above), and s(i,j) be the value of the scoring
matrix corresponding to the ith letter in the first sequence and the
jth letter in the second sequence.

The algorithm uses three matricies, denoted M(i,j),Ix(i,j), and
Iy(i,j).  It recursively builds the matricies row by row like so:
M(i,j)=max(M(i-1,j-1), Ix(i-1,j-1),Iy(i-1,j-1))+s(i,j)
Ix(i,j)=max(M(i-1,j)-d, Ix(i-1,j)-e)
Iy(i,j)=max(M(i,j-1)-d, Iy(i,j-1)-e)
It also keeps a pointer along with each matrix value indicating which
of the terms in the max it chose.

The initial values M(i,0), M(0,j), Ix(i,0) and so on depend on the
type of alignment required, as does the point you choose to build the
alignment from using the matrices.  For the simplest case, overlap
alignment, we set all of the initial values to 0 and we pick the point
to be the largest value of one of the forms M(i,n), M(m,j),Ix(i,n),
and so on.  In other words, we pick largest value on the bottom or
right edge of any of the three matricies.

From there, the algorithm simply traces back the pointers to build the
sequence.  An entry in the M matrix represents a pairing, an entry in
one of the I matrices represents a gap.  The final output has two
pieces, the final score and the alignment.  The alignment looks
something like this:
VLSPAD-K
HL--AESK
where -'s represent gaps in one sequence.

The algorithm can be simplified somewhat by combining the Ix and Iy
matricies into a single matrix I using I(i,j)=max(M(i-1,j)-d,
I(i-1,j)-e,M(i,j-1)-d, I(i,j-1)-e), at the cost of a small potential
of error if any value in the scoring matrix is <-2e.

The original algorithm is O(n^2) and requires memory space on the
order of 3mn.  The simplified algorithm is also O(n^2) but requires
only memory space on the order of 2mn.  Have these limits been passed?

[1] Durbin, Richard, et. al.  Biological sequence analysis -
Probababilistic models of sequence analysis.  Cambridge University
Press, 1998.
[2] Gotoh, O.  An improved algorighm for matching biological
sequences.  1982.  Journal of Molecular Biology 162:705-708.

Request for Question Clarification by mathtalk-ga on 23 Apr 2004 19:09 PDT
Hi, funnyman3595-ga:

Reviews favorable to the fairly introductory treatment of pairwise
alignment algorithms by Durbin et al often recommend:

D. M. Gusfield. Algorithms on Strings, Trees, and Sequences. Addison Wesley, 1998.

for its more rigorous and in depth analysis of complexity issues.

regards, mathtalk-ga

Clarification of Question by funnyman3595-ga on 24 Apr 2004 09:52 PDT
I am attempting to get ahold of a related resource to the one you
recommended, setting the question price down to the minimum until I
get a chance to look at it.

-FM
Answer  
There is no answer at this time.

Comments  
There are no comments at this time.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy