Overdetermined system: Difference between revisions

Latest revision as of 14:30, 7 January 2015


@@ Line 1: / Line 1: @@
-In [[computer science]], '''Hirschberg's algorithm''', named after its inventor, [[Dan Hirschberg]], is a [[dynamic programming]] [[algorithm]] that finds the optimal [[sequence alignment]] between two [[string (computer science)|string]]s. Optimality is measured with the [[Levenshtein distance]], defined to be the sum of the costs of insertions, replacements, deletions, and null actions needed to change one string into the other.  Hirschberg's algorithm is simply described as a [[divide and conquer algorithm|divide and conquer]] version of the [[Needleman&ndash;Wunsch algorithm]].<ref>[http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Dynamic/Hirsch/ Hirschberg's algorithm<!-- Bot generated title -->]</ref>  Hirschberg's algorithm is commonly used in [[computational biology]] to find maximal global alignments of [[DNA]] and [[protein]] sequences.
+Hello! Let me begin by stating my name - Ron Stephenson. Years ago we moved to Kansas. The preferred hobby for my children and me is playing crochet and now I'm attempting to earn money with it. I am a production and distribution officer.<br><br>Also visit my site :: [http://Bikedance.com/blogs/post/29704 http://Bikedance.com]
-==Algorithm information==
-Hirschberg's algorithm is a generally applicable algorithm for optimal sequence alignment. [[BLAST]] and [[FASTA]] are suboptimal [[Heuristic (computer science)|heuristics]].  If ''x'' and ''y'' are strings, where length(''x'') = ''n'' and length(''y'') = ''m'', the [[Needleman-Wunsch algorithm]] finds an optimal alignment in [[Big O Notation|O]](''nm'') time, using O(''nm'') space.  Hirschberg's algorithm is a clever modification of the Needleman-Wunsch Algorithm which still takes O(''nm'') time, but needs only O(min{''n'',''m''}) space.<ref>http://www.cs.tau.ac.il/~rshamir/algmb/98/scribe/html/lec02/node10.html</ref>
-One application of the algorithm is finding sequence alignments of DNA or protein sequences.  It is also a space-efficient way to calculate the [[longest common subsequence problem|longest common subsequence]] of two sequences, as done by the common [[diff]] tool.
-The Hirschberg algorithm can be derived from the Needleman-Wunsch algorithm by observing that:<ref>{{cite journal|author=Hirschberg, D. S.|title=A linear space algorithm for computing maximal common subsequences|journal=Communications of the ACM|volume=18|issue=6|year=1975|pages=341–343|doi=10.1145/360825.360861}}</ref>
-# one can compute the optimal alignment score by only storing the current and previous row of the Needleman-Wunsch score matrix;
-# if <math>(Z,W) = \operatorname{NW}(X,Y)</math> is the optimal alignment of <math>(X,Y)</math>, and <math>X = X^l + X^r</math> is an arbitrary partition of <math>X</math>, there exists a partition <math>Y^l + Y^r</math> of <math>Y</math> such that <math>\operatorname{NW}(X,Y) = \operatorname{NW}(X^l,Y^l) + \operatorname{NW}(X^r,Y^r)</math>.
-== Algorithm description ==
-<math>X_i</math> denotes the i-th character of <math>X</math>, where <math>1 \leqslant i \leqslant \operatorname{length}(X)</math>. <math>X_{i:j}</math> denotes a substring of size <math>j-i+1</math>, ranging from the i-th to the j-th character of <math>X</math>. <math>\operatorname{rev}(X)</math> is the reversed version of <math>X</math>.
-<math>X</math> and <math>Y</math> are sequences to be aligned. Let <math>x</math> be a character from <math>X</math>, and <math>y</math> be a character from <math>Y</math>. We assume that <math>\operatorname{Del}(x)</math>, <math>\operatorname{Ins}(y)</math> and <math>\operatorname{Sub}(x,y)</math> are well defined integer-valued functions. These functions represent the cost of deleting <math>x</math>, inserting <math>y</math>, and replacing <math>x</math> with <math>y</math>, respectively.
-We define <math>\operatorname{NWScore}(X,Y)</math>, which returns the last line of the Needleman-Wunsch score matrix <math>\mathrm{Score}(i,j)</math>:
-  '''function''' NWScore(X,Y)
-    Score(0,0) = 0
-    '''for''' j=1 '''to''' length(Y)
-      Score(0,j) = Score(0,j-1) + Ins(Y<sub>j</sub>)
-    '''for''' i=1 '''to''' length(X)
-      Score(i,0) = Score(i-1,0) + Del(X<sub>i</sub>)
-      '''for''' j=1 '''to''' length(Y)
-        scoreSub = Score(i-1,j-1) + Sub(X<sub>i</sub>, Y<sub>j</sub>)
-        scoreDel = Score(i-1,j) + Del(X<sub>i</sub>)
-        scoreIns = Score(i,j-1) + Ins(Y<sub>j</sub>)
-        Score(i,j) = max(scoreSub, scoreDel, scoreIns)
-      '''end'''
-    '''end'''
-    '''for''' j=0 '''to''' length(Y)
-      LastLine(j) = Score(length(X),j)
-    '''return''' LastLine
-Note that at any point, <math>\operatorname{NWScore}</math> only requires the two most recent rows of the score matrix. Thus, <math>\operatorname{NWScore}</math> can be implemented in <math>O(\operatorname{min}\{\operatorname{length}(X),\operatorname{length}(Y)\})</math> space.
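The two-row idea can be sketched in Python as follows. This is an illustrative sketch, not part of the original pseudocode: the function name <code>nw_score</code> and its keyword parameters are assumptions, and the default costs are taken from the Example section (gap = −2, match = +2, mismatch = −1). As written it keeps rows of length(''Y'') + 1; reaching the min{''n'',''m''} bound would additionally require iterating over the longer sequence.

```python
def nw_score(x, y, ins=-2, dele=-2, match=2, mismatch=-1):
    """Return the last row of the Needleman-Wunsch score matrix for x against y.

    Only the previous and the current row are kept in memory.
    Cost defaults are assumptions borrowed from the article's example.
    """
    # Row 0: aligning the empty prefix of x with every prefix of y (insertions only).
    prev = [0] * (len(y) + 1)
    for j in range(1, len(y) + 1):
        prev[j] = prev[j - 1] + ins

    for i in range(1, len(x) + 1):
        # Column 0: aligning a prefix of x with the empty prefix of y (deletions only).
        cur = [prev[0] + dele] + [0] * len(y)
        for j in range(1, len(y) + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub,  # substitute or match
                         prev[j] + dele,     # delete x[i-1]
                         cur[j - 1] + ins)   # insert y[j-1]
        prev = cur  # the older row is no longer needed
    return prev


print(nw_score("AGTA", "TATGC"))  # [-8, -4, 0, -2, -1, -3]
```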
-The Hirschberg algorithm follows:
-  '''function''' Hirschberg(X,Y)
-    Z = ""
-    W = ""
-    '''if''' length(X) == 0 '''or''' length(Y) == 0
-      '''if''' length(X) == 0
-        '''for''' i=1 '''to''' length(Y)
-          Z = Z + '-'
-          W = W + Y<sub>i</sub>
-        '''end'''
-      '''else if''' length(Y) == 0
-        '''for''' i=1 '''to''' length(X)
-          Z = Z + X<sub>i</sub>
-          W = W + '-'
-        '''end'''
-      '''end'''
-    '''else if''' length(X) == 1 '''or''' length(Y) == 1
-      (Z,W) = NeedlemanWunsch(X,Y)
-    '''else'''
-      xlen = length(X)
-      xmid = length(X)/2
-      ylen = length(Y)
-      ScoreL = NWScore(X<sub>1:xmid</sub>, Y)
-      ScoreR = NWScore(rev(X<sub>xmid+1:xlen</sub>), rev(Y))
-      ymid = PartitionY(ScoreL, ScoreR)
-      (Z,W) = Hirschberg(X<sub>1:xmid</sub>, Y<sub>1:ymid</sub>) + Hirschberg(X<sub>xmid+1:xlen</sub>, Y<sub>ymid+1:ylen</sub>)
-    '''end'''
-    '''return''' (Z,W)
-In the context of Observation (2), assume that <math>X^l + X^r</math> is a partition of <math>X</math>. Function <math>\mathrm{PartitionY}</math> returns index <math>\mathrm{ymid}</math> such that <math>Y^l = Y_{1:\mathrm{ymid}}</math> and <math>Y^r = Y_{\mathrm{ymid}+1:\operatorname{length}(Y)}</math>. <math>\mathrm{PartitionY}</math> is given by
-  '''function''' PartitionY(ScoreL, ScoreR)
-    '''return''' [[arg max]] ScoreL + rev(ScoreR)
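The pseudocode above can be turned into a runnable Python sketch along the following lines. It is a sketch under stated assumptions, not a reference implementation: the names <code>nw_score</code>, <code>needleman_wunsch</code>, <code>partition_y</code> and <code>hirschberg</code>, and the constants <code>DEL</code>, <code>INS</code> and <code>sub</code>, are illustrative and use the costs from the Example section. Ties in the arg max are broken by taking the first maximum, which is one of several valid choices.

```python
DEL, INS = -2, -2  # assumed gap costs (from the article's example)


def sub(a, b):
    """Assumed substitution cost: +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1


def nw_score(x, y):
    """Last row of the Needleman-Wunsch matrix, keeping two rows only."""
    prev = [j * INS for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        cur = [prev[0] + DEL] + [0] * len(y)
        for j in range(1, len(y) + 1):
            cur[j] = max(prev[j - 1] + sub(x[i - 1], y[j - 1]),
                         prev[j] + DEL,
                         cur[j - 1] + INS)
        prev = cur
    return prev


def needleman_wunsch(x, y):
    """Full-matrix alignment; only called when x or y has length 1."""
    n, m = len(x), len(y)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] + INS
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] + DEL
        for j in range(1, m + 1):
            s[i][j] = max(s[i - 1][j - 1] + sub(x[i - 1], y[j - 1]),
                          s[i - 1][j] + DEL,
                          s[i][j - 1] + INS)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    z, w, i, j = [], [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i][j] == s[i - 1][j - 1] + sub(x[i - 1], y[j - 1]):
            z.append(x[i - 1]); w.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and s[i][j] == s[i - 1][j] + DEL:
            z.append(x[i - 1]); w.append('-'); i -= 1
        else:  # only an insertion can explain this cell, so j > 0 here
            z.append('-'); w.append(y[j - 1]); j -= 1
    return ''.join(reversed(z)), ''.join(reversed(w))


def partition_y(score_l, score_r):
    """Index ymid maximising score_l[k] + rev(score_r)[k] (first maximum)."""
    totals = [l + r for l, r in zip(score_l, reversed(score_r))]
    return max(range(len(totals)), key=totals.__getitem__)


def hirschberg(x, y):
    """Return (Z, W): x and y padded with '-' into an optimal alignment."""
    if len(x) == 0:
        return '-' * len(y), y
    if len(y) == 0:
        return x, '-' * len(x)
    if len(x) == 1 or len(y) == 1:
        return needleman_wunsch(x, y)
    xmid = len(x) // 2
    score_l = nw_score(x[:xmid], y)
    score_r = nw_score(x[xmid:][::-1], y[::-1])
    ymid = partition_y(score_l, score_r)
    zl, wl = hirschberg(x[:xmid], y[:ymid])
    zr, wr = hirschberg(x[xmid:], y[ymid:])
    return zl + zr, wl + wr


print(hirschberg("AGTACGCA", "TATGC"))  # ('AGTACGCA', '--TATGC-')
```

Python slices copy the substrings, so this sketch favours readability over the strict linear-space bound; an index-based variant would avoid the copies.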
-== Example ==
-Let
-<math>
-  \begin{align}
-    X &= \mathrm{AGTACGCA},\\
-    Y &= \mathrm{TATGC},\\
-    \operatorname{Del}(x) &= -2,\\
-    \operatorname{Ins}(y) &= -2,\\
-    \operatorname{Sub}(x,y) &= \begin{cases} +2, & \mbox{if } x = y \\ -1, & \mbox{if } x \neq y.\end{cases}
-  \end{align}
-</math>.
-The optimal alignment is given by
-  Z = AGTACGCA
-  W = --TATGC-
-Indeed, this can be verified by backtracking its corresponding Needleman-Wunsch matrix:
-          '''T   A   T   G   C'''
-      '''0'''  -2  -4  -6  -8 -10
-  '''A'''  '''-2'''  -1   0  -2  -4  -6
-  '''G'''  '''-4'''  -3  -2  -1   0  -2
-  '''T'''  -6  '''-2'''  -4   0  -2  -1
-  '''A'''  -8  -4   '''0'''  -2  -1  -3
-  '''C''' -10  -6  -2  '''-1'''  -3   1
-  '''G''' -12  -8  -4  -3   '''1'''  -1
-  '''C''' -14 -10  -6  -5  -1   '''3'''
-  '''A''' -16 -12  -8  -7  -3   '''1'''
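The matrix and the score of the stated alignment can be cross-checked with a short Python sketch. The names <code>nw_matrix</code> and <code>alignment_score</code> are illustrative assumptions, as are the parameter names <code>x_row</code> and <code>y_row</code> for the gapped versions of ''X'' and ''Y''; the costs are those defined for this example.

```python
DEL, INS = -2, -2  # gap costs from the example


def sub(a, b):
    """Substitution cost from the example: +2 if equal, -1 otherwise."""
    return 2 if a == b else -1


def nw_matrix(x, y):
    """Build the full Needleman-Wunsch score matrix (rows: x, columns: y)."""
    n, m = len(x), len(y)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + INS
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + DEL
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(x[i - 1], y[j - 1]),
                              score[i - 1][j] + DEL,
                              score[i][j - 1] + INS)
    return score


def alignment_score(x_row, y_row):
    """Sum the column costs of a gapped alignment of equal-length rows."""
    total = 0
    for a, b in zip(x_row, y_row):
        if a == '-':
            total += INS   # a character of Y was inserted
        elif b == '-':
            total += DEL   # a character of X was deleted
        else:
            total += sub(a, b)
    return total


matrix = nw_matrix("AGTACGCA", "TATGC")
print(matrix[-1][-1])                            # 1
print(alignment_score("AGTACGCA", "--TATGC-"))   # 1
```

Both values agree, which is consistent with the alignment being optimal under these costs.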
-One starts with the top level call to <math>\operatorname{Hirschberg}(\mathrm{AGTACGCA}, \mathrm{TATGC})</math>. The call to <math>\operatorname{NWScore}(\mathrm{AGTA},Y)</math> produces the following matrix:
-         '''T   A   T   G   C'''
-     0  -2  -4  -6  -8 -10
-  '''A''' -2  -1   0  -2  -4  -6
-  '''G''' -4  -3  -2  -1   0  -2
-  '''T''' -6  -2  -4   0  -2  -1
-  '''A''' -8  -4   0  -2  -1  -3
-Likewise, <math>\operatorname{NWScore}(\operatorname{rev}(\mathrm{CGCA}), \operatorname{rev}(Y))</math> generates the following matrix:
-        '''C   G   T   A   T'''
-     0 -2  -4  -6  -8 -10
-  '''A''' -2 -1  -3  -5  -4  -6
-  '''C''' -4  0  -2  -4  -6  -5
-  '''G''' -6 -2   2   0  -2  -4
-  '''C''' -8 -4   0   1  -1  -3
-Their last lines are respectively
-  ScoreL = [ -8 -4  0 -2 -1 -3 ]
-  ScoreR = [ -8 -4  0  1 -1 -3 ]
-<tt>PartitionY(ScoreL, ScoreR) = 2</tt>, such that <math>X = \mathrm{AGTA} + \mathrm{CGCA}</math> and <math>Y = \mathrm{TA} + \mathrm{TGC}</math>.
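The partition step for these two rows can be reproduced with a few lines of Python. The helper names <code>split_totals</code> and <code>partition_y</code> are illustrative assumptions; the input lists are the ScoreL and ScoreR rows given above.

```python
def split_totals(score_l, score_r):
    """Element-wise sum of score_l and the reversed score_r."""
    return [l + r for l, r in zip(score_l, reversed(score_r))]


def partition_y(score_l, score_r):
    """Index of the first maximum of the summed rows."""
    totals = split_totals(score_l, score_r)
    return max(range(len(totals)), key=totals.__getitem__)


score_l = [-8, -4, 0, -2, -1, -3]
score_r = [-8, -4, 0, 1, -1, -3]

print(split_totals(score_l, score_r))  # [-11, -5, 1, -2, -5, -11]
print(partition_y(score_l, score_r))   # 2
```

The maximum total, 1, equals the bottom-right entry of the full Needleman-Wunsch matrix, as expected when the split lies on an optimal path.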
-The entire Hirschberg recursion (which we omit for brevity) produces the following tree:
-                (AGTACGCA,TATGC)
-                /              \
-         (AGTA,TA)            (CGCA,TGC)
-          /     \              /      \
-      (AG,)   (TA,TA)      (CG,TG)  (CA,C)
-               /   \        /   \
-            (T,T) (A,A)  (C,T) (G,G)
-The leaves of the tree contain the optimal alignment.
-==See also==
-* [[Needleman-Wunsch algorithm]]
-* [[Smith Waterman algorithm]]
-* [[Levenshtein distance]]
-* [[Longest common subsequence problem|Longest Common Subsequence]]
-==References==
-{{reflist}}
-{{DEFAULTSORT:Hirschberg's Algorithm}}
-[[Category:Sequence alignment algorithms]]
-[[Category:Bioinformatics algorithms]]
-[[Category:Articles with example pseudocode]]
-[[Category:Dynamic programming]]
