Alcohol Dehydrogenase Pepsin Click on molecules for description Ribosome Rubisco
Molecules and descriptions are from PDB

Bioinformatics and Computational Molecular Biology

Algorithms and Hardware

The Smith-Waterman Alignment Algorithm Equations

 

The equations of the Smith-Waterman sequence alignment algorithm are shown below. This algorithm is used to place two portions of character stings next to each other such that the portions are most similar. The two portions may each include one or more gaps. There is a penalty for starting a new gap and a (usually smaller) penalty for continuing a gap. It is not necessary that the characters at a particular pairing location match exactly. There is a substitution matrix specified which shows which substitutions are good and which are bad. Good substitutions (which include an exact match) get a reward and bad substitutions get a penalty.

 

The Equations:

I(i,j) = D(i,j) = M(i,j) = 0; for all i and j such that i = 0 or j = 0

I(i,j) = max[I(i-1,j) - c, M(i-1,j) - g]

D(i,j) = max[D(i,j-1) - c, M(i,j-1) - g]

M(i,j) = max[I(i-1,j-1) + d, D(i-1,j-1) + d, M(i-1,j-1) + d, 0]

d is a function of the character at i and the character at j in the two strings to be aligned.

 

Meanings of the Variables:

I(i,j) == score when aligning positions i and j when inserting.

D(i,j) == score when aligning positions i and j when deleting.

M(i,j) == score when aligning positions i and j when matching/mutating.

c == penalty for continuing a gap.

g == penalty for starting a new gap.

d == reward (if positive) or penalty (if negative) for matching/mutating.

 

Note: The equations are symmetic, so it does not matter which string is assigned to i and which is assigned to j. An insertion in one string is interpreted as a deletion in the other an vice versa.

Reference: T. Smith and M. Waterman, "Identification of Common Molecular Sequences," J. Molecular Biology, 147, pp. 195-197, 1981.

 

Boise State University College of Engineering

Boise State University Department of Electrical and Computer Engineering

This page created by Dr. Scott F. Smith

This page was last updated on 11 March 2004.