Thursday, October 28, 2010

Dynamic Programming

I am an individual who when approaching a problem prefers to have a well defined procedure for solving it. Over the years of exposure to Algorithms I have come up with the following method for approaching a Dynamic Programming problem.

Dynamic Programming Approach

Step 1: Describe an array(s) of values you will want to compute. This is usually a property of the optimal alignment and the most difficult step in Dynamic programming. If you come up with a good definition the rest follows naturally. Also, as a hint start by writing down the first thing that comes to you, surprisingly it is often right.

Step 2: Provide a recurrence relation for computing the array described in step 1. In other words, define some entry in your array in terms of previous entries

Step 3: Write an algorithm for computing the array of step 1 using the recurrence relation of step 2

Step 4: Write an algorithm to recover an optimal solution from the array in step 1. This algorithm generally consists of backtracking through the array starting with the value that defines the optimal solution.

Example

For example let us consider the Needleman - Wunsch Algorithm for global sequence alignment.

Problem Statement: Given two sequences S and T where |S| = n and |T| = m. Find a O(m*n) algorithm which computes the optimal alignment between S and T subject to the scoring function p(a,b). p(a,b) is the cost of aligning symbol a to symbol b. An optimal alignment has a maximum possible score.

Step 1- Define an Array:
let V(i,j) be the optimal score of the alignment between S[0,i] and T[0,j]

Step 2 - Write a Recurrence Relation:
V(i,j) <- max{
V(i-1, j-1) + p(S[i], T[j]), // align symbol S[i] to symbol T[j]
V(i-1,j) + p(S[i],_), // align symbol S[i] to a gap, _
V(i,j-1) + p(_,T[j]), // align symbol T[j] to a gap, _
}
V(n,m) contains the score of the optimal global alignment of S and T

Step 3 - Write an Algorithm for Computing the Array
:
// begin with initialization of the array
V(0,0) <- 0
for i <- 1 to n
V(i,0) <- V(i-1,0) + p(S[i],_);
for j <- 1 to m
V(0,j) <- V(0,j-1) + p(_,T[j]);
// compute remainder of the array using the recurrence relation
for i <- 1 to n
for j <- 1 to m
V(i,j) <- max{
V(i-1, j-1) + p(S[i], T[j]), // align symbol S[i] to symbol T[j]
V(i-1,j) + p(S[i],_), // align symbol S[i] to a gap, _
V(i,j-1) + p(_,T[j]), // align symbol T[j] to a gap, _
}

Step 4 - Write an Algorithm to Recover the Optimal Solution
:
i = n
j = m
reconstructed_S = ""
reconstructed_T = ""
while i > 1 & j > 1
if (V(i,j) == V(i-1, j-1) + p(S[i], T[j]))
reconstructed_S <- S[i] + reconstructed_S
reconstructed_T <- T[i] + reconstructed_T
i--
j--
else if (V(i,j) == V(i-1,j) + p(S[i],_))
reconstructed_S <- S[i] + reconstructed_S
reconstructed_T <- _ + reconstructed_T
i--
else if (V(i,j) == V(i,j-1) + p(_,T[j]))
reconstructed_T <- T[i] + reconstructed_T
reconstructed_S <- _ + reconstructed_S
j--

For more information on the Needleman - Wunsch Algorithm:
http://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm
http://www.avatar.se/molbioinfo2001/dynprog/dynamic.html

No comments:

Post a Comment